Half way to real reform


Universities in Germany have undertaken overdue reform, but more
change is needed to fully tap their potential.


Germany is the world’s fourth largest investor in research and
development, and its overall scientific impact suggests that
much of the money is well spent. But a great deal of that
impact comes from the 80 institutes of the Max Planck Society. The
university sector is underperforming (see page 630).

The reasons for this can be traced back to the country’s turbulent
twentieth-century history and the ideologies that invaded the
universities before and after the Second World War, on both sides of
the Iron Curtain. By the 1990s, universities were overpopulated with
students that they had not themselves selected, underfunded, and
hide-bound by rules preventing them from competing with each other.

These problems have been recognized for a while, and other European
countries may learn from Germany’s response. Reforms implemented
during the past few years have given the universities much
more control over their own destinies, sending shockwaves through
the academic landscape. For example, universities may now offer
competitive salaries and conditions to selected researchers by
transferring support from less productive colleagues.

To encourage institutions that are reluctant to make the most of
their new freedoms, research organizations have launched competitions
that highlight which universities are doing well and which badly.
Perhaps the most influential has been the federal research ministry's
Excellence Initiative, which selects a handful of élite universities.

All of this makes for a good start on university reform, but there is
a long way to go. Visitors to German universities are unlikely to see,
for example, the diversity of gender, age and nationality that they
would encounter in a typical US research university. The number of
female professors remains dismally small. New initiatives to increase
the number of young professors have so far made only a small dent
in academic demographics.

And Germany remains less attractive to young foreign scientists
than it ought to be. The latest figures from the European Commission’s
Marie Curie programme, which funds young European Union
(EU) researchers to work in a second EU country, show that only
11.5% choose to go to Germany, hardly changed from five years ago
and still well below France (16%) and the United Kingdom (32%).


One reason for Britain’s popularity is language — English is already 
widely spoken and a few years in an institution where it is the working 
language helps a scientific career. But the fact remains that German 
universities could do more to create a receptive environment for for- 
eign students and staff. 

It will be some time before the positive impact of the reforms 
undertaken so far shows up in statistics. In the meantime Germany 
needs to address a few extra problems that have been either caused 
or highlighted by the reforms themselves. 

As efforts concentrate on building up a young faculty, the traditional
position of the low-level academic, the Mittelbau, whose nearest
equivalent is perhaps the assistant professor, is disappearing. The
heavy teaching load that these people used to bear now falls on young
professors who ought to be devoting themselves to research. This is a
hard conflict to resolve, as the teaching is equally important, but
recruitment must be broadened to address it.

Additionally, many universities are still loath to appoint tenured 
professors from among their own junior staff. This principle was 
intended to avoid parochial appointments, but it has become less 
necessary in the current era of constant evaluation, where there is a 
natural pressure to appoint the best candidate. The rule often serves 
as an obstacle to young researchers seeking a route to tenure. 

The universities will also benefit indirectly from the deal cut two 
years ago between federal and state governments, Germany’s non- 
university research institutes, and the DFG, the main grant-funding 
agency. In return for guaranteed 3% annual budget increases until 
2010, these institutes are expected, among other things, to encourage 
greater collaboration with both industry and research universities. 

This is a positive development for all concerned, giving institutes 
such as the Max Planck stable budgets and the universities better 
access to their resources. It is no coincidence that two of three uni- 
versities selected by the Excellence Initiative had already developed 
unusually tight links with local Max Planck institutes.


Bad execution 


China won't achieve a tenable drug regulation 
policy by hanging public officials. 


The sentencing to death of Zheng Xiaoyu, the former head of
China's State Food and Drug Administration (SFDA), is a
throwback to the nation’s ugly past that will do little to further
its professed goal of building a fair drug-regulation regime.


Zheng was sentenced to death in a Beijing court on 29 May on 
charges of accepting bribes, two years after he was sacked from
the drug regulator. Given the secrecy of China's judicial process, it is 
difficult to assess his guilt or innocence. But accusations involving the 
bribery of hundreds of officials have shadowed the agency for years. It 
is good that the Chinese government is facing up to the problem and 
taking public steps to clean up its drug-regulation process. 

But hanging a man and vilifying him in state-controlled newspa- 
pers does not inspire confidence that China is building an effective 
drug-regulatory process. If the sentence is carried out, it is more likely 
to confirm the pharmaceutical industry’s worst fears that there is little
chance of doing business fairly in a country where the rule of law
remains patchy and subject to political influence. 

Drug regulation is vitally important to China as it seeks to develop 
an internationally competitive drug industry of its own, while attract- 
ing investment from and collaboration with the rest of the world. 
The country rightly sees the establishment of such an industry as 
critical to both public health and the nurturing of innovation in the 
life sciences. In common with most other governments — but with 
rather better prospects of success — China regards the successful 
combination of research in biology and genetics, and innovation in 
biotechnology and pharmaceuticals, as an important element of its 
plans for scientific and economic development. 

From the global industry’s point of view, the establishment of a 
sound regulatory regime in China is just as important. The world’s 
leading drug companies see the country, with its burgeoning middle 
class, as a market of great potential. Yet participation in that market 
remains something of an enigma. All of the major drug companies 
have stepped tentatively into direct participation in research activities 
in China. They view the risks of investment in China as considerable, 
but the benefits will only reveal themselves if and when a reliable 
regulatory regime is established. 


So everyone in the industry, at home and abroad, supports the pro- 
fessed aims of Beijing's drive to eliminate widespread corruption from 
drug regulation. But they are entitled to be suspicious of its imple- 
mentation. Corruption has been widespread and no one believes that 
Zheng — supposing that the charges against him are proven — was 
the only, or even the worst, culprit. 


Articles appearing in China Daily and elsewhere in condemnation of the
official and his family have the smell of old-fashioned, Stalinist
scapegoating, more likely to sweep the problem under the carpet than
resolve it. Genuinely fair regulation of drugs is a complex matter that
depends on transparency and on sophisticated checks and balances, such
as scientific staff who are paid by the government but can be seen to
be independent, not on fear and arbitrary justice.

Hanging a man may create the public impression that the problem
is being zealously tackled. Real movement towards fair regulation
would involve steps a great deal less melodramatic that yet seem
beyond China’s grasp: steps towards a transparent drug-review
process, functioning under open, public scrutiny.


Community service 


Introducing three free-access websites for research 
networking and outreach. 


The mission statement that appeared in the second issue of
Nature in 1869 and is reproduced every week on our printed
table of contents may use archaically high-flown language, but
it still applies. In essence, we exist to help scientists communicate with
each other and to communicate science to wider audiences.

Precisely that duality applies to two websites to be launched this
week: Nature Reports Climate Change and Nature Reports Stem Cells.
Aimed at researchers and at anyone else who is interested, both give
an editorial perspective on their fields through a combination of
original journalism and commissioned comment, alongside archived
material from other Nature publications. Both sites also facilitate
community interactions through blogs.

For example, the climate-change site focuses on post-Kyoto agendas,
both journalistically and with an analysis of the obstacles by
development expert Jeffrey Sachs (see www.nature.com/climate).
The stem-cells site contains a similar blend of news about the latest
research and comments, as well as a featured editor (this month,
cloning researcher Ian Wilmut). It also goes behind the research
papers with an editorial commentary and extracts from referees’
comments (with their permission) on the paper in this issue of Nature
on developmental reprogramming by Egli et al. (see page 679 and
www.nature.com/stemcells).

These sites will develop further by way of community interactions
and applications in the coming months. The original content of both
is freely accessible.

Also free is a very different website to be launched next week:
Nature Precedings. As its title implies, this site will enable researchers
to share, discuss and cite their early findings. It provides a lightly
moderated and relatively informal channel for scientists to dissemi- 
nate information, especially recent experimental results and emerg- 
ing conclusions. In this sense, it is designed to complement traditional 
peer-reviewed journals, allowing researchers to make informal com- 
munications such as conference papers or presentations more widely 
available and enabling them to be formally cited. This, in turn, allows 
them to solicit community feedback and establish priority over their 
results or ideas. 

Intended to cover biomedicine, chemistry and the Earth sciences, 
the site (http://precedings.nature.com) will host a wide range of 
research documents, including preprints, unpublished manuscripts, 
white papers, technical papers, supplementary findings, posters and 
presentations. All submissions will be reviewed by staff curators and 
accepted only if they are considered to be legitimate scientific contri- 
butions of likely interest to others in that field. No judgement is to be 
made about the quality or uniqueness of the work, and submissions 
are not subjected to peer review before they are released. Because of 
this, accepted submissions will usually be published within one work- 
ing day, and no charge is made to either authors or readers. 

Nature Precedings will make full use of participative features such 
as tagging, voting and commenting to facilitate the discovery of 
especially interesting and relevant content. We anticipate that the 
content will be mirrored by academic partner organizations, several 
of whom have been involved with us in developing this service. As 
well as allowing it to become incorporated into the substantial infor- 
mation hubs already provided by these organizations, this federated 
approach will also help to ensure the long-term availability of the 
content — and act as a practical guarantee of the Nature Publishing 
Group’s pledge not to charge readers for access.
RESEARCH HIGHLIGHTS


Mouse model for malaria


PLoS Pathog. 3, e72 (2007) 

A global collaboration has come up with the 
first non-primate animal model for human 
malaria — and on the way explained how a 
key antibody protects infected people. 

Richard Pleass of the University of 
Nottingham, UK, and his colleagues 
engineered an antibody that matches those 
found in Gambian adults who are immune to 
malaria. The antibody binds a Plasmodium 
falciparum protein known as MSP1₁₉, an
important vaccine target. The picture shows 
infected red blood cells. 

To test the antibody in mice, the researchers 
engineered the mouse malarial parasite 
Plasmodium berghei to express MSP1₁₉.
They also made transgenic mice carrying a
gene that encodes the human immune-cell
receptor FcγRI, suspected to play a role in the
antibodies’ protective effect. The antibodies
protected only the transgenic mice from
infection, confirming the importance of FcγRI.


G. GAUGLER/SPL 


CLIMATE SCIENCE 


Uncertain forecast 


Science doi:10.1126/science.1140746 (2007) 
Global warming could boost rainfall by more 
than double the amount predicted by current 
climate models, a new study suggests. 

Frank Wentz and his colleagues at Remote 
Sensing Systems in Santa Rosa, California, 
analysed weather-satellite data from 1987 to 
2006. Climate models project that worldwide 
rainfall will increase by between 1 and 3% 
per degree of warming, but the satellite data 
suggest rainfall will go up in line with the 
atmosphere’s water vapour content — at a 
rate of around 7% per degree. The models 
forecast less rain because they predict that 
weakening surface winds will reduce water 
evaporation. But the data show that surface 
winds actually increased with warming. 

It is, for now, unclear whether the
discrepancy is due to flaws in the models
or problems with the data. It is also unclear
where the extra rain, if it arrives, might fall:
whether it will make wet places wetter, or
bring relief to drought-stricken regions.


NANOTECHNOLOGY 


Spot the ball 


J. Am. Chem. Soc. 129, 6666-6667 (2007) 

The highly symmetrical atomic structure 
of the football-shaped C₆₀ molecule has
been seen for the first time using electron 
microscopy. 

Kazu Suenaga and his co-workers at the 
National Institute of Advanced Industrial 
Science and Technology in Tsukuba,
Japan, imaged individual C₆₀ molecules
tethered to the surface of carbon nanotubes. 
Comparisons between these images (example 
pictured below left) and image simulations 
(right) based on the molecules’ 20-sided cage 
structure (middle) allow the observed two- 
dimensional shapes to be assigned to various 
projections of the carbon shells. These are 
made from pentagonal and hexagonal rings. 

The researchers also see several distorted, 
non-spherical shells that they assign to C₅₈
molecules, formed when C₂ units are kicked
out of the shells by the electron beam. 


COSMOLOGY 
When the Universe began 


Astrophys. J. 170 (Suppl.), 263-287; 288-334; 335- 
376 and 377-408 (2007) 

Four papers describing data from one of 
cosmology’s greatest experiments have been 
published, a year after the results were made 
public. 

The Wilkinson Microwave Anisotropy 
Probe was launched in 2001 to study the 
radiation left over from the inferno of the 
Universe's birth. By mapping temperature 
fluctuations in this cosmic microwave 
background and measuring the polarization of 
the radiation, the probe has provided evidence 
that the Universe is made up mostly of dark 
matter and dark energy. It has also shed light on 
aspects of the Universe’s history, such as when
the first stars were born. The four papers report 
observations collected over three years. 




NEUROSCIENCE 


Cellular angst 


Nature Neurosci. doi:10.1038/nn1919 (2007) 
Anxious people tend to expect the worst in 
uncertain situations — a trait that researchers 
have now linked in mice to particular cells 
within the brain’s hippocampus. 

Cornelius Gross of the European 
Molecular Biology Laboratory in 
Monterotondo, Italy, and his team studied
mice genetically engineered for increased
anxiety. These mice froze in fear for the same 
length of time when exposed to a cue — such 
as a flash of light — that was always followed 
by an electric shock as they did when exposed 
to a different cue that was only sometimes 
associated with a shock. Normal mice 
responded less strongly to the ambiguous 
than to the certain cue. 

The researchers showed that inhibiting 
the granule cells in the anxious mice’s 
hippocampal dentate gyrus restored normal 
behaviour. 


GEOLOGY 


Ancient lava fossils dated 


Geology 35, 487-490 (2007) 

Radiometric dating has confirmed that 
microscopic tubular structures in ancient 
lavas date back billions of years. These 
structures are thought to show that life thrived 
in volcanic rocks deep within the early Earth. 

Neil Banerjee of the University of 
Western Ontario in London, Canada, and 
his colleagues found the microfossils in 
pillow lavas in the Pilbara Craton of western 
Australia. The tubes contain traces of organic 
carbon, and appear identical to those left in 
basaltic rocks by modern microbes. 

The tubular structures at this site also 
contained the mineral titanite, which allowed 
them to be dated by measuring trace amounts 
of lead and uranium. This revealed the 
structures to be 3 billion years old. 


CANCER BIOLOGY 


The price of silence 


Cell 129, 879-890 (2007) 

Researchers have identified a possible genetic 
culprit behind one of the most common 
forms of adult leukaemia. 


Albert de la Chapelle and Christoph Plass 
of Ohio State University in Columbus and 
their colleagues found that reduced activity of 
a gene called death-associated protein kinase 
1 (DAPK1) is linked to both inherited and 
spontaneous forms of chronic lymphocytic 
leukaemia. DAPK1 regulates programmed
cell death in blood cells called lymphoid 
cells, and loss of that function could aid the 
leukaemia. 

In most cases of the disease, the gene had 
been silenced by the addition of methyl 
groups to the DNA region controlling 
DAPK1 expression. But the team also
discovered, in affected members of a family 
with a history of the leukaemia, a mutation 
that reduces DAPK1 expression. 


BIOCHEMISTRY 


Single handedly 


J. Am. Chem. Soc. doi:10.1021/ 
ja0708870 (2007) 

Why are all amino acids in living 
organisms left-handed? Donna 
Blackmond and her co-workers 
at Imperial College London,
UK, suggest a new way in which
crystallization could have played 
a part. 

The reactions expected to have 
produced amino acids on the early Earth 
generally create both left- and right-handed 
forms of the molecules. For some amino 
acids, such as serine, a tiny excess of one form 
can lead to that form’s preferential removal by 
precipitation of crystals, leaving the solution 
enriched in the other. 

Blackmond and her colleagues have
now shown that this enrichment can be
engineered for amino acids that don't display
it by themselves. Small molecules such as
dicarboxylic acids added to the mixture
become incorporated into the crystals and
promote the extraction of one form of the
amino acid.


MATERIALS SCIENCE 


Reflect on this 


Nature Mater. doi:10.1038/nmat1930 (2007) 


The first characterization of a protein from the reflective skin
of the Hawaiian bobtail squid (Euprymna scolopes, pictured
above) has revealed the protein to have unusual optical properties
and impressive self-assembly skills.
Rajesh Naik and his colleagues at the Air 
Force Research Laboratory in Dayton, Ohio, 
engineered bacteria to produce the squid 
protein reflectin. They found that it has the 
highest refractive index — a measure of how 
slowly light travels through a material — 
reported for any naturally occurring protein. 
When deposited from a particular type of 
solvent, the reflectin proteins self-organized 
into a film of regularly spaced stripes, the 
separation of which could be tuned. This 
functioned as a diffraction grating, which 
splits light into its different wavelength 
components. 


JOURNAL CLUB 


Gautam R. Desiraju 
University of Hyderabad, India 


A chemist applauds an
algorithm able to predict crystal 
structures from chemical 
composition alone. 


I work in crystal engineering,
a field that involves designing
and constructing crystals with
desired physical, chemical or
pharmaceutical properties from
small organic molecules. It is
an experimental science based
on pattern recognition and
retrosynthetic strategies, in which
the structure is considered as the
sum of smaller, simpler parts.

Improvements to 
computational crystal-structure 
prediction could make design 
protocols more reliable. But this 
is such a difficult problem that
only a handful of groups in the
field work on it. In this context, I
found a recent paper presenting a 
seemingly reliable method to be 
thought-provoking (A. R. Oganov 
and C. W. Glass J. Chem. Phys. 
124, 244704; 2006). 

Typically, crystal-structure
prediction involves computer
generation of putative crystal
structures using a force field, 
which represents the interactions 
between atoms in neighbouring 
molecules. The correct structure 
is presumed to be that which 
minimizes the crystal’s energy. 
The procedure is problematic 
because the force fields may not 
be well tailored to the molecules 
being studied, and because the 
experimental structure may not be 
the lowest-energy arrangement. 
It is also impossible to explore all 
conceivable structures, which are 
mind-boggling in number. 
Oganov and Glass use an
evolutionary algorithm to localize
the search to the most promising
structures. Their approach is
attractive in that it requires no
system-specific knowledge (the
input is just the molecule’s
chemical composition, not even
its structure), and their ability
to predict the unusual tetragonal
structure of urea is impressive.
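
For readers who have not met evolutionary search, the general idea can be sketched in a few lines of Python. This is a toy illustration only, not the method of Oganov and Glass: it assumes a Lennard-Jones pair potential in place of a real force field or quantum-mechanical energy, treats the "structure" as a small cluster of identical atoms in a box rather than a periodic crystal, and every function name and parameter below is hypothetical. The only input is the number of atoms, which stands in for chemical composition.

```python
import math
import random


def pair_energy(r):
    """Lennard-Jones energy of one atom pair at distance r (toy force field)."""
    r = max(r, 1e-6)  # guard against coincident atoms
    inv6 = (1.0 / r) ** 6
    return 4.0 * (inv6 * inv6 - inv6)


def total_energy(structure):
    """Sum the pair energies over all atom pairs in a candidate structure."""
    energy = 0.0
    for i in range(len(structure)):
        for j in range(i + 1, len(structure)):
            energy += pair_energy(math.dist(structure[i], structure[j]))
    return energy


def random_structure(n_atoms, box, rng):
    """Place n_atoms at uniformly random positions inside a cubic box."""
    return [tuple(rng.uniform(0.0, box) for _ in range(3)) for _ in range(n_atoms)]


def crossover(parent_a, parent_b, rng):
    """Build a child by taking each atom position from one parent or the other."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(structure, step, rng):
    """Displace every atom by a small Gaussian random step."""
    return [tuple(x + rng.gauss(0.0, step) for x in atom) for atom in structure]


def evolve(n_atoms, generations=200, pop_size=30, box=3.0, step=0.1, seed=0):
    """Keep the lower-energy half each generation and refill it with offspring."""
    rng = random.Random(seed)
    population = [random_structure(n_atoms, box, rng) for _ in range(pop_size)]
    history = []  # best energy seen at the start of each generation
    for _ in range(generations):
        population.sort(key=total_energy)
        history.append(total_energy(population[0]))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), step, rng))
        population = survivors + children
    best = min(population, key=total_energy)
    return best, history


if __name__ == "__main__":
    best, history = evolve(n_atoms=4)
    print("first-generation best energy:", history[0])
    print("final best energy:", total_energy(best))
```

Because the survivors are carried over unchanged, the best energy can never get worse from one generation to the next. The sketch still shares the weaknesses noted above: it is only as good as its energy function, and nothing guarantees that the lowest-energy arrangement it finds is the one nature adopts.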

Is this the long-awaited break- 
through in crystal engineering? 
Perhaps not, but surely it’s an 
important step forward. 


Discuss this paper at http://blogs.nature.com/nature/journalclub
NEWS


Simple switch turns cells embryonic 


Research reported this week by three differ- 
ent groups shows that normal skin cells can be 
reprogrammed to an embryonic state in mice.
The race is now on to apply the surprisingly 
straightforward procedure to human cells. 

If researchers succeed, it will make it relatively 
easy to produce cells that seem indistinguishable 
from embryonic stem cells, and that are geneti- 
cally matched to individual patients. There are 
limits to how useful and safe these would be 
for therapeutic use in the near term, but they 
should quickly prove a boon in the lab. 

“It would change the way we see things quite
dramatically,” says Alan Trounson of Monash
University in Victoria, Australia.
Trounson wasn't involved
in the new work but says he 
plans to start using the tech- 
nique “tomorrow”. “I can think 
of a dozen experiments right 
now — and they’re all good ones,” he says. 

In theory, embryonic stem cells can propa- 
gate themselves indefinitely and are able to 
become any type of cell in the body. But so far, 
the only way to obtain embryonic stem cells 
involves destroying an embryo, and to get a 
genetic match for a patient would mean, in 
effect, cloning that person — all of which raise 
difficult ethical questions. 

As well as having potential ethical difficulties, the ‘cloning’
procedure is technically difficult. It involves obtaining unfertilized
eggs, replacing their genetic material with that from an adult cell and
then forcing the cell to divide to create an early-stage embryo, from
which the stem cells can be harvested. Those barriers may now have
been broken down.

“Neither eggs nor embryos are necessary. 
I’ve never worked with either,” says Shinya
Yamanaka of Kyoto University, who has pio- 
neered the new technique. 

Last year, Yamanaka introduced a system that 
uses mouse fibroblasts, a common cell type that 
can easily be harvested from skin, instead of 
eggs. Four genes, which code for four specific
proteins known as transcription factors, are 
transferred into the cells using retroviruses. The 
proteins trigger the expression 
of other genes that lead the cells 
to become pluripotent, mean- 
ing that they could potentially 
become any of the body’s cells. 
Yamanaka calls them induced 
pluripotent stem cells (iPS cells). “It’s easy. 
There's no trick, no magic,” says Yamanaka.

The results were met with amazement, along 
with a good dose of scepticism. Four factors 
seemed too simple. And although the cells had 
some characteristics of embryonic cells — they 
formed colonies, could propagate continuously 
and could form cancerous growths called tera- 
tomas — they lacked others. Introduction of 
iPS cells into a developing embryo, for exam- 
ple, did not produce a ‘chimaera’ — a mouse 
carrying a mix of DNA from both the original 
embryo and the iPS cells throughout its body. “I
was not comfortable with the term ‘pluripotent’
last year,” says Hans Schöler, a stem-cell specialist
at the Max Planck Institute for Molecular
Biomedicine in Münster who is not involved
with any of the three articles.

This week, Yamanaka presents a second generation
of iPS cells, which pass all these tests.
In addition, a group led by Rudolf Jaenisch
at the Whitehead Institute for Biomedical
Research in Cambridge, Massachusetts, and a
collaborative effort between Konrad Hochedlinger
of the Harvard Stem Cell Institute and
Kathrin Plath of the University of California,
Los Angeles, used the same four factors and got
strikingly similar results.

“It's a relief as some people questioned our
results, especially after the Hwang scandal,”
says Yamanaka, referring to the irreproducible
cloning work of Woo Suk Hwang, which
turned out to be fraudulent. Schöler agrees:
“Now we can be confident that this is something
worth building on.”

The improvement over last year’s results was 
simple. The four transcription factors used by 
Yamanaka reprogramme cells inconsistently 
and inefficiently, so that less than 0.1% of the 
million cells in a simple skin biopsy will be 
fully reprogrammed. The difficulty is isolating 
those in which reprogramming has been suc- 
cessful. Researchers do this by inserting a gene 
for antibiotic resistance that is activated only 
when proteins characteristic of stem cells are 
expressed. The cells can then be doused with
antibiotics, killing off the failures.

The protein Yamanaka used as a marker for
stem cells last year was not terribly good at
identifying reprogrammed cells. This time, all
three groups used two other protein markers,
Nanog and Oct4, to great effect. All three
groups were able to produce chimaeric mice
using iPS cells isolated in this way; and the
mice passed iPS DNA on to their offspring.

Jaenisch also used a special embryo to produce
fetuses whose cells were derived entirely
from iPS cells. “Only the best embryonic stem
cells can do this,” he says.

The birth of this chimaeric mouse suggests that the cells
used to generate it behave like embryonic stem cells.

“It’s unbelievable, just amazing,” says Schöler,
who heard Jaenisch present his results at a
meeting on 31 May in Bavaria. “For me it’s like
Dolly [the first cloned mammal]. It’s that type
of accomplishment.”

The method is inviting. Whereas cloning
with humans was limited by the number of
available eggs and by a tricky technique that
takes some six months to master, Yamanaka’s
method can use the most basic cells and can be
accomplished with simple lab techniques.

But applying the method to human cells has
yet to be successful. “We are working very hard,
day and night,” says Yamanaka. It will probably
require more transcription factors, he adds.

If it works, researchers could produce iPS
cells from patients with conditions such as
Parkinson's disease or diabetes and observe the
molecular changes in the cells as they develop.
This ‘disease in a dish’ would offer the chance
to see how different environmental factors
contribute to the condition, and to test the ability
of drugs to check disease progression.

But the iPS cells aren't perfect, and could
not be used safely to make genetically matched
cells for transplant in, for example, spinal-cord
injuries. Yamanaka found that one of the factors
seems to contribute to cancer in 20% of his
chimaeric mice. He thinks this can be fixed,
but the retroviruses used may themselves also
cause mutations and cancer. “This is really
dangerous. We would never transplant these into a
patient,” says Jaenisch. In his view, research into
embryonic stem cells made by cloning remains
“absolutely essential”.

If the past year is anything to judge by,
change will come quickly. “I’m not sure if it will
be us, or Jaenisch, or someone else, but I expect
some big success with humans in the next year,”
says Yamanaka.
David Cyranoski
Additional reporting by Heidi Ledford
For more on alternative stem-cell work, see page
649; and see www.nature.com/stemcells


Bush's climate plan ‘nothing new’


President George W. Bush's
31 May announcement of a “new
framework” for international
efforts to confront climate change
sounded, at first, like a sharp
turnaround by the White House.
But as analysts dissected his
statements, many concluded that
he had said little new.

Leadership talk: George Bush hints at setting global targets.
J. SCOTT APPLEWHITE/AP

In a speech in Washington DC to
the Global Leadership Campaign,
a group that lobbies for greater
spending on international affairs,
Bush called for the top-emitting
countries to meet by the end of
2008 to set a long-term global goal
for greenhouse-gas emissions. The
notion of Bush proposing a global
target to tackle climate change
caused a flurry of excitement.

But in a briefing afterwards,
James Connaughton, the
president's environmental adviser,
said that Bush was referring only
to “a long-term aspirational goal”
rather than a binding commitment.
“It is the implementing mechanisms
that become binding,” he said.

“It remains to be seen whether
this initiative means anything,”
says Bert Metz, a climatologist at
the Netherlands Environmental
Assessment Agency in Bilthoven.
Stabilizing atmospheric levels of
greenhouse gases, he says, requires
“ambitious and urgent international
collaboration. Only starting the
discussions on this next year sounds
rather strange.”

Because Bush's proposal arrives just
before this week's annual G8 meeting
of the richest nations’ leaders in
Germany, many have interpreted it
as a tactical manoeuvre to lighten
the pressure on him to agree to do
anything firm about climate change
at that meeting. Alden Meyer, a
climate specialist at the Union of
Concerned Scientists in Cambridge,
Massachusetts, adds that it “could
serve as a huge diversion” at the
planned United Nations (UN)
negotiations about climate change
in Bali in December.

Others argue that Bush's
proposed framework may help
rather than hinder progress.
“I think that there is a good chance
that whatever comes out of this
process will merge into the UN
process,” says Jeff Holmstead, a
former environment official with
the Bush administration, now at the
law firm Bracewell and Giuliani in
Washington DC.

Stephen Schneider, a
climatologist at Stanford University
in California, thinks that a side deal,
separate from but not replacing
the UN process, could in theory be
helpful. But he says that the last
such side deal initiated by Bush
(the Asia-Pacific partnership of July
2005, which heavily emphasized
technology developments to
address climate change) is
widely regarded as a flop. “It lets
greenhouse gases continue to rise
at the same rate and under-funds
research by a factor of 100. People
who are cynical and view [the new
proposal] as disingenuous have a
long historical trail to base this on,”
he adds.

“It remains to be seen
whether this initiative
means anything.”

Japan and Australia, participants
in the Asia-Pacific deal, have
welcomed Bush's plan. Shinzo Abe,
Japan's prime minister, has said he
thinks the United States is “finally
getting serious in dealing with
global warming”. Last week, Abe
launched Japan's plan for a
non-binding arrangement to halve global
greenhouse-gas emissions by 2050.

China, another Asia-Pacific
partner and the world's second-largest
emitter of carbon dioxide
after the United States, also
announced its plan to tackle climate
change this week. It intends to
focus on improving environmental
management and agricultural and
energy efficiencies and, like the
United States, boost research and
development for alternative energy,
but without compromising economic
development. China notes that its
per-capita emissions are lower than
the world average, and much lower
than those of the United States. The
United States, it says, should take
the lead in reducing emissions.
Emma Marris and Colin Macilwain
Additional reporting by Quirin
Schiermeier and David Cyranoski
NASA explores scope for far-flung fix


Researchers credit servicing missions involv- 
ing astronauts with rescuing the Hubble Space 
Telescope and keeping it alive for the past 17 
years. But an idea to create a similar, if simpler, 
capability for Hubble's successor is raising eye- 
brows among project scientists, who fear it will 
be impractical and expensive. 

The James Webb Space Telescope (JWST), 
currently scheduled for launch in 2013, will 
make infrared observations from a position in 
space 1.5 million kilometres from Earth. That 
puts it farther away than Hubble, which sits 
in low-Earth orbit just 600 kilometres away, 
within easy reach of the shuttle. So the JWST 
was designed assuming that it would fly without 
any servicing — by astronaut or robot — for its 
five-to-ten-year career. 

But Edward Weiler, 
head of NASA’s God- 
dard Space Flight Center 
in Greenbelt, Mary- 
land, which manages 
the JWST, has indicated 
that there may be scope 
for a manned mission to 
service or make simple 
repairs to the telescope. 

“It might be valuable if astronauts could fly
to the JWST to do something that was a critical,
but easy fix,” says Weiler, “like opening a
stuck antenna.” Weiler, who used to be NASA's
science chief, floated the idea of attaching a
docking port to the telescope to allow a future
mission to hook up.

Out of reach? The James Webb Space Telescope is
expected to run without being serviced.

But the concept got a chilly reception last
week at the American Astronomical Society's
biannual meeting in Honolulu, Hawaii. When
the JWST was being developed, it seemed
impossible that it would ever receive guests,
because it will be well beyond the reach of the shuttle.
Only the decision to develop the Orion crew
exploration vehicle, due to replace the shuttle

DNA reveals how the chicken crossed the sea


The discovery of chicken bones with
Polynesian DNA at an archaeological
site in Chile has added hard, physical
evidence to the controversial theory
that ancient seafarers from the south
Pacific visited the New World long
before Columbus.

When the Spanish conquistador
Francisco Pizarro first visited Peru
in 1532, he noted the importance
of chickens in the daily lives and
religious rituals of the Incas. But how
the birds got there was a mystery.
Chickens were first domesticated
in Asia, and their absence from
archaeological sites in the Americas
indicates that they were not carried
by migrating peoples over a land
bridge from Asia to Alaska.

One alternative theory, that
Polynesians visited the Americas,
bringing livestock with them and
perhaps influencing cultural and
technological development in the
region, has long been disparaged
by mainstream archaeologists, as
it has largely been supported by
supposition rather than evidence.

So Alice Storey of the University
of Auckland, New Zealand, was
not particularly enthusiastic when
a colleague in Chile asked her
to sequence DNA from a trove
of ancient chicken bones he had
excavated at El Arenal, a site
occupied between 700 and 1390 AD,
to see if their origins could be traced
to the Pacific islands. “I
thought, ‘Well, we'll give it
a go’,” she says.

Storey and her team
reconstructed a 400-base-pair
fragment of mitochondrial
DNA from both the Chilean bones
and chicken bones excavated on
five archipelagos in Polynesia.
Mitochondrial DNA doesn't mutate
much and so is useful for tracing
evolutionary lines. The Chilean
sequences were identical to those
from prehistoric sites in Tonga and
Samoa (A. A. Storey et al. Proc.
Natl Acad. Sci. USA doi:10.1073/pnas.0703993104; 2007).
Radiocarbon analysis dated the 
bones to between 1304 and 1424 
AD, firmly before Europeans arrived 
on the east coast of South America 
in the 1500s. The same sequences 
are also present in the modern-day 
Araucana chicken, an odd Chilean 
breed that has tufted ‘ears’, lays 
blue eggs and lacks a tail. 

The study has left the research 
community cautiously optimistic 
that hard evidence for migration of 
Polynesians has been found. Jaime 
Gongora, a molecular geneticist at 
the University of Sydney, Australia, 
says the paper is a significant 




after its retirement in 2010 and take astronauts 
back to the Moon, puts the telescope within 
reach. “If Orion is available, and we have a 
really simple, but significant problem on the 
JWST, wouldn't it make sense to ensure that 
astronauts could go to the JWST if they could 
fix it?” asks Weiler. 

But the harsh radiation environment in deep 
space would probably make it far too danger- 
ous for astronauts, says John Mather, the 
JWST’s chief project scientist at Goddard. And 
a robot mission could probably do very little. It
might be able to give the satellite a good shake 
to loosen a stuck solar panel, says Mather, but 
would be unlikely to cope with more complex 
tasks. And in its repair efforts, it might dirty the 
telescope's outer mirrors.

Mather says the JWST’s team is now con- 
ducting a feasibility study to find out whether 
a docking port could be added. But given that it 
is unlikely that a problem so simple it could be 
fixed by a robot will surface during the mission, 
Mather says he is not keen to add something to 
the already grossly over-budget telescope. “If 
it costs more than a few thousand dollars,” he 
says, “I'm not interested.”

Repair missions to spacecraft closer to Earth 
than the JWST have so far been rare, but not 
unheard of. An orbiting mission to study the 
Sun was rescued by a service mission carried 
out by astronauts on the space shuttle Chal- 
lenger in 1984. The first servicing mission to 
Hubble in 1993 fixed a critical error, installing 
a corrective optics system to fix the telescope’s 
blurry vision. It has since been serviced a further
three times.
Geoff Brumfiel 


Disgraced official was paid work bonus 


Further troubling reports have surfaced in 
the case of a disgraced US official accused 
of political interference in the workings of 
the Endangered Species Act. It has been 
disclosed that Julie MacDonald, former 
deputy assistant secretary for fish, wildlife 
and parks at the Department of the 
Interior (DOI), received a performance 
award of nearly $10,000 in 2005. Yet 
the report of an investigation into her 
conduct, released on 27 March this year, 
reveals that MacDonald violated federal 
regulations while in that position. She 
resigned on 1 May. 

The report, by the DOI’s office of
inspector general, paints a portrait of 
a woman determined to minimize the 
Endangered Species Act’s effect on the 
economy. It includes evidence from 
colleagues that she heavily edited science 
reports from the field despite having no 
formal scientific training, and bullied and 
intimidated field scientists into producing 
documents along the lines she wanted. 

Observers say the case highlights how 
appointees of President George W. Bush 
can and have pushed political agendas 
within federal agencies. “She was a little bit 
more overt and transparent and shameless 
about her political antics and dealings, 
but she was not a lone ranger,” says Jamie
Rappaport Clark, executive vice-president
of the environmental group Defenders of
Wildlife in Washington DC and former
director of the US Fish and Wildlife Service.

MacDonald was also chastised for 
sharing “nonpublic information with 
private sector sources”, including a
nonprofit lobby group called the California 
Farm Bureau Federation; the Pacific Legal 
Foundation, a law firm that represents 
development interests; and a friend from an 
online game. The report outlines how she 
sent internal departmental documents to 
a friend in the game World of Warcraft “to 
have another set of eyes give an unfiltered 
opinion of them”. MacDonald could not be 
reached for comment by Nature. 

The latest chapter comes from Steve 
Davies, editor of the newsletter Endangered 
Species & Wetlands Report. Davies learned 
through a Freedom of Information Act 
request that MacDonald received a Special 
Thanks for Achieving Results award of 
$9,628 in March 2005, during the period 
covered by the investigation. The DOI 
will not detail the reasons for the award; 
it says the justification is included in her 
performance evaluation, which is private. 

Meanwhile, Democrats in Congress 
are investigating MacDonald for her role 
in removing the Sacramento splittail 
fish from the endangered species list. 
MacDonald owns a farm in a floodplain
that is a habitat for the fish, according
to an investigation by the Contra Costa
Times, a newspaper in California.
Emma Marris 




contribution to the field, but warns
that the small fragments obtained
from ancient DNA may tell only
part of the story. The final verdict
will require more extensive DNA
data to make a full family tree of
both modern and ancient breeds,
he says.

Archaeologist Terry Jones
at California Polytechnic State
University in San Luis Obispo, who
has studied prehistoric Polynesian
contact in the New World, is less
circumspect. “It's essentially
unequivocal evidence,” he says.

It's all relative: Chile's Araucana
chicken shares its DNA with
ancient birds from Polynesia.

Evidence of contact between the 
communities has been put forward 
in the past. In 1947, Thor Heyerdahl 
famously filmed his journey by raft 
from Peru across the Pacific to try to 
prove that South Americans could 
have settled the Pacific islands,
although the theory was at odds 
with much of the evidence. 

More recently, Jones, along with 
Kathryn Klar at the University of 
California, Berkeley, has argued that 
the Polynesians introduced complex 
fish hooks and sewn plank canoes to 
the Chumash and Gabrielino Indians 
in southern California and the 
Mapuche Indians in Chile (K. A. Klar 
and T. L. Jones Am. Antiquity 70, 
457-484; 2005). Others argue that
Polynesians must have visited the 
tropical coast of South America in 
order to bring back the sweet potato 
and the bottle gourd. The voyage to 
South America is no more daunting 
than other trips Polynesians are 
known to have made. 

Even so, one of the co-authors 
on the chicken study, Atholl 
Anderson at the Australian 
National University, Canberra, 
is wary of overestimating the 
extent of this cultural diffusion 
without further study. Although 
the chickens provide hard evidence 
of transoceanic contact, the 
evidence that large-scale cultural 
exchange occurred remains largely 
circumstantial, he says.
Brendan Borrell 


Genome miners rush to stake claims 


This February, Laura Scott, a genetic epide- 
miologist at the University of Michigan, Ann 
Arbor, spent her holiday sitting in a ski lodge 
in front of her computer, “very occasionally 
trying to go out and ski”. She was working 
on one of three papers that appeared online 
in Science in April, the same day that a com- 
peting group announced similar findings in 
Nature Genetics (refs 1-4). All four papers identify
genes implicated in adult-onset diabetes, one 
of western society's commonest ailments. And 
all four are part of a new genetics gold rush 
that uses such ‘genome-wide association stud- 
ies’ to flag disease-related genes. Hence Scott's 
indoor skiing trip: “The feeling was: ‘It’s gotta
get out there’,” she says.

As in any gold rush, prospectors are pursu- 
ing the spoils as quickly as their tools, skills and 
finances will allow. This week, Nature publishes 
the biggest pot of gold yet: a report tagging 
more than 20 genetic markers associated with 
seven common diseases, from bipolar disorder 
to hypertension, teased out from the DNA of 
17,000 people (see page 661). 

Science is always competitive. But human 
genetics is going through a particularly intense 
burst of activity. The race is to identify single- 
letter DNA variations that are more frequent in 
patients with common, complex diseases and 
thus serve as markers for susceptibility. 

Technological advances that have come fully 
on-line only in the past 18 months are allowing 
geneticists to examine common diseases such 
as diabetes, which are caused both by environmental
influences and by an unknown number
of genes that contribute to different degrees.
To find these small genetic influences, scien- 
tists must screen a thousand or more patients 
with each disease for hundreds 
of thousands of single-let- 
ter markers, and compare the 
results against similar screens 
of people without the disease. 

Such studies are still expen- 
sive — costing, at minimum, 
nearly a million dollars — and 
there are only a limited number 
of strong genetic associations to 
be found for the most common diseases. So the 
race is on to be first to pick and publish this 
low-hanging fruit, which many expect to be 
collected within a year or two. 

“If you've invested a large amount of money 
and a lot of time in doing one of these stu- 
dies, you don't want to publish your paper a 
month after someone else has found the same 
things,” says statistician Peter Donnelly of the
University of Oxford, who is chairman of the
Wellcome Trust Case Control Consortium and
a lead author on today’s study.

Trail of blood: many genes influence the development of common diseases such as diabetes.

Many scientists, including Donnelly, wel- 
come the competition. A host of suspect genes 
published in the early days of the rush have 
turned out to be false positives. The way that 
results today are being confirmed through 
multiple, overlapping findings published
near-simultaneously by competing groups
“is fantastic for the field”, says Donnelly.

In some cases, the prospec- 
tors are working together in 
the hope of improving their 
claims. The authors of the 
three papers on diabetes 
that appeared in Science, for 
instance, agreed months before
publication to pool their initial data to improve
their chances of identifying truly significant 
genetic variants. 

A working group at the US National Insti- 
tutes of Health (NIH) that today publishes a 
set of proposed standards for genome-wide 
association studies (see page 655) stresses 
the desirability of simultaneously publishing 
independent replication of results. It notes, 
however, that some work may be so important
as to justify its publication before it has been 
replicated. 

The results published by Donnelly and col- 
leagues today do not include independent 
replication. “The referees were unanimous 
that this advance was powerful enough as a 
landmark not to necessitate the conditions recommended
by the NIH,” says Philip Campbell,
Nature’s editor-in-chief. Indeed, the pace of 
the field means that several other groups have 
already confirmed and published some genetic 
links highlighted in this paper. 

Many of these findings will lead to genetic 
tests to identify those at greater risk of com- 
mon disease. But tests may be of limited use 
without an obvious intervention, such as statin 
drugs for those at increased risk of heart dis- 
ease. Patients must now wait for cell biologists 
to discover how the suspect genes do their 
damage — and for drug companies to exploit 
those findings. 

Meredith Wadman 
1. Diabetes Genetics Initiative of Broad Institute of Harvard
and MIT, Lund University, and Novartis Institutes of
BioMedical Research Science 316, 1331-1336 (2007).
2. Zeggini, E. et al. Science 316, 1336-1341 (2007).
3. Scott, L. et al. Science 316, 1341-1345 (2007).
4. Steinthorsdottir, V. et al. Nature Genet. 39, 770-775 (2007).
“I am not sure that
it is fair to say that it
is a problem we must
wrestle with.”


NASA administrator Michael Griffin 
discusses climate change on US radio. 


“Global temperature
is nearing the level
of dangerous climate
effects.”


NASA scientist Jim Hansen and his 
colleagues express a rather different 
view in a recent publication. 


SCORECARD 


Meditation
Mitch Altman's Brain
Machine (pictured)
claims to induce a state
of deep calm by
synching users’
brain activity to
flashing LEDs
and beeps.


Concentration
Researchers at
University College
London have developed a
psychometric test to measure
proneness to distraction. The
test could help employers such as
airlines that need staff able to...
oh look, a chicken!


ZOO NEWS 


Cock up 

Britain's Royal Society for the 
Protection of Birds was ridiculed 
last week when its software 
automatically removed the word 
‘cock’ from a forum posting about
male blackbirds, replacing it with 
asterisks. Great tits (Parus major) 
are apparently still acceptable. 


NUMBER CRUNCH 


15 cm is the average length of an
erect human penis, as determined 
by 11,531 measurements. 


12% of men in a survey of
50,000 believed that they had 
small penises. 


0% of men complaining of small
penises in a similar study actually 
had a ‘micropenis’, defined as a 
flaccid length of less than 7 cm. 


Sources: NPR, Atmos. Chem. Phys., 
makezine.com, Psychol. Sci., Daily 
Telegraph, BJU Intl 


Diplomatic talks spur 
hope in Libya HIV case 


Diplomats are cautiously optimistic that a deal 
may be within reach, perhaps by the end of 
June, to save the lives of five Bulgarian nurses 
and a Palestinian doctor condemned to death 
in Libya for allegedly deliberately injecting 
over 400 children with HIV in 1998. 

Private negotiations have recently intensi- 
fied between Libya and the European Union — 
which Bulgaria joined on 1 January — to try to 
find a way out of the politically 
charged case. Any deal would 
have to balance provision of 
humanitarian aid for long- 
term treatment of the infected 
children, and support for their 
families, against compromising 
the medical workers’ defence 
with implied guilt. Islamic law allows for blood 
money to substitute for punishment. 

The medics were condemned to death on 
19 December 2006. Arrested in 1999, they 
were first found guilty and sentenced to death 
in May 2004, but the Libyan Supreme Court 
overturned the verdict and ordered a retrial. 
When that retrial also found them guilty, it 
sparked a worldwide political and public out- 
cry. Scientists argue that medical evidence 
exonerates the six, and that contaminated 
medical supplies and equipment caused the 
outbreak. This evidence was denied a hearing
in court. The six have lodged an ultimate
appeal to the Supreme Court, but no date has
been set for this.

The case has seen many false starts, but 
diplomats are now cautiously optimistic that 
progress is being made. On 27 May, the medics 
were acquitted in a separate but related case of
slander, for accusing police of torturing them 
to extract confessions. 

Other political moves have been afoot. Tony 
Blair, Britain's outgoing prime 
minister, met with Libyan 
leader Muammar Gaddafi on 
a farewell trip to Africa last 
week. Blair's office said their 
discussion would include the 
medics’ case. In public, Blair 
announced strengthened coop- 
eration between the two countries — perhaps 
significant, as the HIV case has become an 
obstacle to Libya’s ongoing integration into 
the international community. 

After meeting with Blair, a representative 
for the infected children’s families indicated
his openness to reaching a solution. At the 
same time, Libya’s foreign ministry issued a 
statement that the ongoing talks were intended 
“to find a solution favourable for all sides”. 

Nicolas Sarkozy, the new French president,
made resolution of the case a foreign-policy
priority in his victory speech last month. And
George W. Bush, in an interview on Bulgarian
Television last week, reiterated the United
States’ desire for the case to “be solved
quickly and in a way that is satisfactory to
the Bulgarian people.”

Diplomats hope that the activity might
result in a resolution before a summit of
European Union heads of state in Vienna on
21 June, just before Germany’s presidency of
the European Union ends.

“We are greatly appreciative of the very
strong European diplomatic activity,” says
Emmanuel Altit, a member of the medics’
defence team. “The talks are going in the
right direction. Let’s say I’m less pessimistic
than a few months ago.”

Declan Butler

Tony Blair's meeting with Muammar Gaddafi in Libya last week may speed progress to a resolution.


Terror terms for arsonists


Ten radical environmental
activists have been sentenced
over the past few weeks for a
string of arsons committed in the
late 1990s and early 2000s. The
group, extremists claiming to be
members of the Earth Liberation
Front and Animal Liberation
Front, targeted scientists and
sites involved in activities such
as logging and the culling of wild
horses. As Nature went to press,
most of the sentences had been
handed out, and they ranged
from 3 to 13 years.

Lauren Regan, a lawyer
working with the convicted
arsonists, says the sentences
are “not surprising and within
the realm of reasonable”, but
that ‘terrorism enhancements’
added to many sentences will
make their treatment in jail
much more harsh and will label
them for the rest of their lives.
The judge in the case, Ann
Aiken, ruled that crimes carried
out with “intent or desire to
influence, affect, or retaliate
against government conduct”
were eligible for these enhancements.
She made it clear,
though, that she was ruling only
on the narrow legal question of
whether the crimes fit the legal
criteria for the enhancement,
and not on the broader, more
controversial question of
whether the people involved
should be labelled as terrorists.

Remains of the day: arsonists
destroyed many years’ work.


One of the biggest fires, and 
perhaps the most memorable 
to the scientific community, 
was the torching of a building 
in the Center for Urban 
Horticulture at the University 
of Washington in Seattle on 21 
May 2001. The fire targeted 
the work of Toby Bradshaw, 
whom the group thought was 
genetically engineering poplar 
trees. “I am delighted that
the perpetrators have been 
caught,” says Bradshaw, “and 
satisfied that the criminal- 
justice system is capable of 
determining an appropriate 
punishment.” 

Group members were also
found responsible for torching
a lab of the US Department
of Agriculture's Animal and
Plant Health Inspection 
Service (APHIS) in Olympia, 
Washington, on 21 June 1998. 
Lab worker Dale Nolte, who 
now works on avian flu for the 
service, says that he hasn't 
followed the trial and has no 
opinion on the sentencings. 
“My focus from the beginning 
was to recover our facilities, 
to keep up the morale of
our scientists and keep the
work going,” he says. One
member of the group was sentenced to
more than 12 years in prison, 
which included a terrorism 
enhancement. 


The sentencing memo 
depicts a group of ideological 
activists who were not always 
successful at crime. Their 
cars broke down, accomplices 
dropped out at the last 
minute, members of the group 
were busted for shoplifting,
and time and again their 
incendiary devices failed to go 
off. Yet, according to federal 
prosecutors, they racked up 
more than US$40 million in 
damages between 1995 and 
2001. No one was harmed 
during the group's actions, 
although many contend that 
this was more through luck 
than careful planning. 

Emma Marris 
Sibling rivalry hits
Swiss institutes 


Tempers erupted last week at the Swiss Federal 
Institute of Technology Zürich (ETHZ), with
faculty members claiming that its board had 
sneakily siphoned its budget off to Switzerland’s 
other federal institute, the EPFL in Lausanne. 

Interim president of ETHZ Konrad Oster- 
walder has complained formally to the Swiss 
government, saying that the ETH Board — a 
politically independent body responsible for 
both universities and for four federal research 
institutes — “made serious errors in both the 
form and content of [its] decision on the allo- 
cation of the 2008 budget”. Department heads 
at ETHZ have also asked Pascal Couchepin, 
the government minister responsible for 
research and higher education, for his support 
in solving the crisis. 

As part of Switzerland’s push to bolster 
its research and higher education sector, the 
ETH Board’s budget for 2008 will be nearly 
4% higher than that for this year. The board 
decided to give a disproportionate sum to 
EPFL, even though there has been no politi- 
cal decision about how Switzerland might 
afford a second top-level university, say staff 
from ETHZ. The staff say that the board used 
different starting budgets to calculate the per- 
centage increase for each institute, and that 
it did not release the information within the 
required time before the meeting. 

ETHZ also hit the headlines last November 
when faculty members forced its president, 
Ernst Hafen, to resign. Hafen had tried to 
implement organizational changes at the uni- 
versity that had been desired by the board but 
that the faculty members thought were detri- 
mental to the institute. 

“The source of all the problems is the ETH 
Board,” says Kathy Rifkin, spokeswoman for
the Swiss parliamentary committee on sci- 
ence and research. She says that parliament is 
discussing the abolition of the board, to bring 
more decision-making back into the govern- 
ment — most particularly decisions about 
apportioning the budget. 

Alexander Zehnder, president of the ETH 
Board, says that he is surprised by the reac- 
tion. “The extra money given to Lausanne was 
not core money, but strategic funds used to 
integrate cancer research into that university 
plus some extra to reward the improvement in 
its research quality,” he says. He adds that the
board's procedures for budget allocation were 
transparent. The government has declined to 
comment on the dispute. 

Alison Abbott 
Bush requests $30 billion
to fight AIDS worldwide 


President George W. Bush is seeking to 
double US funding for fighting HIV and 
AIDS globally, requesting $30 billion to be 
spent over five years from October 2008. 
The money, which would be focused on 
prevention and treatment, should also 
benefit research projects in many of the 15 
countries targeted in the developing world. 

The three-year-old President’s Emergency 
Plan for AIDS Relief, funded by the 
Department of State, doesn’t support 
research directly. But the fact that it treats 
so many people — 1.1 million so far 
— “broadens the research questions that 
we can ask”, says Anthony Fauci, director
of the US National Institute of Allergy and 
Infectious Diseases. For instance, scientists 
can examine the effect of antiretroviral 
treatment on the progress of other diseases, 
such as tuberculosis. 

Bush’s request, announced on 30 May, is 
expected to be approved by Congress. 


University union calls for 
academic boycott of Israel 


Representatives of Britain’s University and 
College Union (UCU) voted last week to 
call on its more than 120,000 members to 
consider a boycott of Israeli universities. 
The resolution, passed by 158 votes to 99, 
asks that union members “consider the 
moral implications of existing and proposed 
links with Israeli academic institutions”. 
The union was formed in 2006 in a merger 
of two of Britain’s leading associations of 
higher-education teachers. Both these 
associations had previously voted to boycott 
Israeli universities, and then rescinded or 
downgraded their resolutions. The latest 
vote drew prompt criticism, not only from 
outside but also from within the union. “I 
do not believe a boycott is supported by the 
majority of UCU members,” says Sally Hunt,
the union’s general secretary. 


Plane passengers asked 
to check in over TB risk 


The US Centers for Disease Control and 
Prevention (CDC) in Atlanta, Georgia, is to 
look at how one of its employees was involved 
in a case of extensively drug-resistant, or
XDR, tuberculosis (TB) that triggered a
health advisory last week.

The CDC announced on 29 May that a 
man, who has XDR tuberculosis, had taken 
two transatlantic flights, and that it was 
seeking passengers who had travelled with 
him. Andrew Speaker had flown to Europe 
for his wedding and honeymoon after being 
diagnosed with TB, but before learning that
it was the XDR strain. He is now in hospital. 

On 2 June, the CDC said that it would 
review the role in the matter of the man’s 
new father-in-law — a CDC microbiologist 
who works in the Division of Tuberculosis 
Elimination. In a statement, the father-in- 
law, Robert Cooksey, said: “My son-in-law’s 
TB did not originate from myself or the 
CDC's labs, which operate under the highest 
levels of biosecurity”. 
Tuberculosis bacteria that are resistant to nearly 
all drugs have sparked a health alert. 


Foundation reasserts 
claims in stem-cell patents 


A return shot has been fired over three key 
stem-cell patents that the US Patent and 
Trademark Office made moves to revoke 
in April. 

Critics of the patents, which cover 
primate embryonic stem cells and methods 
for making them, have claimed that the 
patents are overreaching and undeserved. 

On 31 May, the organization that 
administers the patents, the Wisconsin 
Alumni Research Foundation (WARF), filed 
its official response to the agency’s decision. 
In this, WARF says previous research was 
done in mice, which was difficult to translate 
into human cells, and that the patented 
work was therefore not “obvious” — a 
crucial yardstick for determining whether a 
research advance deserves a patent. 

Those who contested the patents now 
have a month to respond to WARF's filing, 
and a long appeals process could follow. 


Japan's health ministry 
calls for tests on Tamiflu 


Japan's health ministry has said that it will 
ask the Japanese distributor of Tamiflu to 
conduct tests on possible side effects of the 
anti-influenza drug. 

Reports of strange and sometimes 
suicidal behaviour in people taking Tamiflu 
have already prompted investigations in 
Japan (see Nature 446, 358-359; 2007). 

The country accounts for roughly 75% 
of worldwide prescriptions of the drug, 
produced by Swiss firm Roche. 

The ministry will ask Chugai 
Pharmaceuticals, Roche's Japanese partner, 
to carry out five types of in vitro and 
animal test to examine how the drug is 
metabolized and transported in the brain. 
Chugai was waiting for an official order 
from the ministry as Nature went to press; 
a representative said the company plans to 
comply with its demands. 


Hidden papers revealed on neutron’s 75th anniversary 


The letter shown on the right was 
released by London's Royal Society 
on 1 June, along with five packets 
of scientific papers that had been 
hidden away during the Second 
World War. The papers describe 
experiments on nuclear fission and 
provide details that could be used to 
build a nuclear reactor. 

“The outbreak of war marked 
the end of nuclear science being a 
collective investigation,” notes Keith 
Moore, the society's head of library 
and archives. 

This letter accompanied one of the packets and is signed by the English physicist James 
Chadwick, who discovered the neutron in 1932. The papers were released to coincide with the 
75th anniversary of his discovery. 


Xtreme team takes the 
high road for blood tests 


A team of researchers and climbers, engaged 
in a study of how the body responds to 
low-oxygen conditions, has pushed to the 
summit of Mount Everest. 

The group didn't quite manage its goal 
of taking blood samples on the peak. “We 
decided that taking an arterial blood sample 
on the summit itself was too dangerous,” 
team member Dan Martin wrote in his 
online diary for Nature on 30 May (see 
http://tinyurl.com/2g3mxn). Instead, 
members of the Caudwell Xtreme Everest 
expedition descended a few hundred metres 
to a spot where gloves could be removed 
more safely. This is still, they say, the highest 
altitude at which blood gas content has ever 
been examined. The team also collected 
data on oxygen use from subjects on an 
exercise bike at 8,000 metres. 

They hope the data will help inform the 
treatment of patients back at sea level who 
have difficulty shunting oxygen around the 
body. 

Everest has played host to an experiment 
assessing how the body copes with low oxygen. 


Indo-US nuclear talks 
stall for a fourth time 


Indian and US negotiators have once again 
failed to agree on a nuclear deal mooted in 
July 2005 (see Nature 436, 446-447; 2005). 

The deal promises India nuclear fuel and 
technology from the United States, in return 
for India opening its civilian reactors to 
inspection. But, like three earlier rounds of 
talks, the latest discussions ended in New 
Delhi on 2 June without a final agreement. 

Despite both sides claiming they have 
made some progress, sources in India’s 
Department of Atomic Energy told Nature 
that the talks have reached a stalemate that 
will only be resolved at the highest political 
level. India wants the right to reprocess spent 
fuel to recover plutonium and to retain the 
option of carrying out nuclear tests. 


Over a pork barrel 


The US Congress is well-known for tucking special provisions for favoured projects into budget bills. David 
Goldston explains why ‘earmarks’ for research and development have risen so dramatically in recent years. 


In Washington DC, Congress is beginning to write 
the spending bills for fiscal year 2008, with 
the hope of having the House of Representa- 
tives and the Senate each vote on their versions 
before the August recess. The process is likely 
to be even more politically charged than usual: 
the new Democratic leadership in Congress will 
be trying to show that the president stints on 
domestic priorities, while the president will aim 
to paint Congress as profligate. 

No doubt the age-old battle over congres- 
sional ‘earmarks’ will figure prominently in the 
effort to shape public perceptions of the budget. 
Indeed, this January the president called on 
Congress to halve the number of these earmarks, 
which provide money for a specific project or 
entity to help a constituent or friend. 

Interestingly, Congress itself has been feeling 
a little queasy about earmarks of late. Although 
generally it sees earmarks as a fundamental 
prerogative, an egregious example occasionally 
makes headlines and cools the ardour for 
‘pork-barrel spending’. (The name alone points to the 
nineteenth-century origins of the practice.) The 
most recent case was the $200-million ‘bridge 
to nowhere’ — a project, pushed through by 
then-chairman of the House Transportation 
and Infrastructure Committee Don Young 
(Republican, Alaska), to link a small city and 
an airport in Alaska that are now connected by 
ferry. Congress eventually rescinded the money 
— but not before the span gained mythical sta- 
tus as a symbol of wasteful spending. 

Along with other scandals, that incident 
led the new Congress to change the rules for 
earmarking as one of its first orders of busi- 
ness. Every earmark must now be publicly 
listed, along with the name of the legislator 
who sought it. 

But ‘transparency’ is an odd way to limit ear- 
marking. The whole point of earmarks is to get 
public credit for them — at least back home. 
The new rules might act as a brake on the 
total spending on earmarks, or on particularly 
embarrassing projects, but they haven't reduced 
the demand for pork. Members of the House 
have requested more than 31,000 earmarks for 
fiscal 2008, probably a record number. 

It is safe to assume that the number of ear- 
marks requested in research and development 
(R&D) programmes has grown apace. The 
American Association for the Advancement of 
Science estimates that R&D earmarks grew from 
about $1.5 billion in fiscal 2002 to about $2.4 bil- 
lion in fiscal 2006. (Most of the earmarks were 
for colleges and universities.) That’s not a huge 
number in an R&D budget of more than $140 
billion, but it can put a noticeable dent in fund- 
ing for specific agencies, such as the National 
Oceanic and Atmospheric Administration, 
and programmes, such as the Department of 
Energy's hydrogen effort. Moreover, the rate of 
growth is a legitimate cause for concern. 

Why has ‘academic pork’ grown so rapidly? 
The most obvious reason is also the most often 
overlooked: more colleges and universities want 
it. Earmark requests almost always originate 
with the beneficiary, not with the representa- 
tives. As schools and faculty members have 
become more entrepreneurial, and as federal 
funding has come to be seen as a test of prestige, 
more schools have sought money — often to do 
the kinds of projects the government isn't oth- 
erwise funding or for programmes that aren't 
strong enough to win awards in traditional grant 
competitions. And once one school wins some 
federal cash from Congress, more want to play. 

And there are ever more people in Washing- 
ton DC who want to help them win. Lobbying 
is a growth industry, generally offering high 
salaries. It seems that no retiring congress- 
man goes back home anymore; they all stay in 
Washington and lobby — looking for clients, 
including research institutions, and encourag- 
ing them to seek federal funds. This structural 
change — a burgeoning ‘private sector’ that 
needs ever more lobbying clients to thrive — 
makes it hard to foresee any significant reversal 
in earmarking trends. 

Finally, Congress has bought into the notion 
that R&D is the key to economic competitiveness. 
So helping colleges and universities get 
some federal money is more enticing than it was 
when institutions of higher education seemed 
to have little relevance to larger political con- 
cerns. And economic development has always 
been a justification for pork-barrel projects. 

With all the factors pushing towards growth 
in academic pork, the real surprise is that there 
isn't even more of it. Fortunately, academic pork 
is still viewed as somewhat suspect, especially 
among the congressmen who most closely fol- 
low science issues. Pork is seen as a way to salve 
the injustices and inadequacies in the standard 
grant-making process, not as a sign that the 
overall system is in need of surgery. Notably, 
what are arguably the two most prestigious 
R&D agencies, the National Science Founda- 
tion (NSF) and the National Institutes of Health, 
have never been earmarked — perhaps both an 
effect and cause of their prestige. (Congress does 
sometimes push specific large construction 
projects at the NSF, but those are first proposed 
by the agency, not Congress, and Congress does 
not choose where to locate them.) 

That doesn't mean that there hasn't been 
pressure to earmark the two agencies, especially 
from those who see peer review as a clubby system 
that benefits only the ‘haves’. In a happy 
coincidence, some of the ‘have nots’, often 
schools from the south and Rocky Mountain 
west, are represented by conservatives who see 
any form of earmarking as a kind of budgetary 
incontinence. 

Also, Congress has tried to divert some of 
the earmarking pressure by setting aside com- 
petitive funds for institutions in states that do 
not get a large share of federal R&D funds. The 
programme began at the NSF decades ago, and 
Congress has gradually replicated it in other 
agencies. 

So for now, academic earmarks will probably 
continue to grow, but not without limit. Con- 
gress is likely to remain nervous through this 
political cycle that too much pork will smear 
it with a reputation for fiscal irresponsibility. 
And when it comes to academic grant making, 
Congress still tends to believe, to paraphrase 
Winston Churchill, that peer review is the 
worst system except for all the others. 
David Goldston is a visiting lecturer at 
Princeton University's Woodrow Wilson 
School of Public and International Affairs. 
Meeting in 
the middle 


Support for copycat versions of 
biotechnology drugs is growing 
quickly in the US Congress. 
Meredith Wadman reports. 


In an immaculate, glass-fronted complex 
in a suburb of Seattle, Washington, more 
than a hundred scientists at the biotechnology 
company Amgen are busy cranking out a 
new generation of anticancer drugs, and honing 
manufacturing processes they hope will eventu- 
ally deliver treatments to millions of patients. 

And 3,000 miles away on Capitol Hill, con- 
gressman Jay Inslee (Democrat, Washington) 
— whose district is home to the Amgen facility 
— is busy with a project of his own: pushing a 
draft law that, he says, will enable the Amgens 
of the future to survive and prosper. 

In April, Inslee introduced legislation that 
would insulate inventors of biologics — com- 
plicated, large-molecule drugs — from generic 
competition for 14 years. The bill is intended 
to fend off stricter legislation that, Inslee says, 
could cripple the whole industry. “We can cre- 
ate a pathway to lower-cost copies of biotech 
drugs without eliminating incentives to create 
breakthrough medicines,” he says. 

Other lawmakers see it differently. Bills 
introduced by Henry Waxman (Democrat, 
California) in the House and Hillary Clinton 
(Democrat, New York) in the Senate in 
February would allow biogenerics that are 
similar to brand-name biotech drugs — but 
not similar enough to infringe patents — 
to appear from the moment the original 
drugs hit the market. “This will lead to 
healthy competition and long-term savings 
for patients,” says Waxman. 

The bills reflect a growing momentum to 
get generic versions of biologics to market 
in a Congress controlled by Democrats since 
January. With the first generation of patents 
on biologics now expiring, and increasing 
public disquiet about drug prices, lawmakers say 
they are determined to make generic biologics 
a reality by writing a law giving the US Food 
and Drug Administration (FDA) the explicit 
authority to approve them for market. 

In Congress, “the leading voices are talking 
more about how do we do this than whether 
we should do it”, says Michael Werner, a former 
chief of policy at the Biotechnology Industry 
Organization (BIO) who now runs a consulting 
firm in Washington DC. 

It is not only Democrats who are pushing 
for the change. A bipartisan group of senators, 
including Ted Kennedy (Democrat, 
Massachusetts), Orrin Hatch (Republican, Utah) and 
Mike Enzi (Republican, Wyoming), will next 
week unveil compromise biogenerics legislation 
that they hope to bring to a Senate vote by July. 
Biogenerics are “an issue whose time has come”, 
says Craig Orfield, a spokesman for Enzi. 

Finding a bill that can pass into law will 
involve balancing the interests of the still-risky 
biotech industry with those of employers, 
insurance companies and patients 
complaining about paying tens 
of thousands of dollars annually 
for biologics. 

“Without generic competition, the cost of 
biologics is unsustainable,” claims Missy 
Jenkins, a spokeswoman for the Coalition 
for a Competitive Pharmaceutical Market, 
an organization of employers and health 
insurers lobbying for action on biogenerics. 

The coalition, which backs Waxman’s bill, 
points to the price of drugs such as Avastin 
(bevacizumab), Genentech’s cancer drug, which 
costs up to US$100,000 for a year’s treatment. 

Cancer drug Avastin can help patients such as Richard Lewis — but at a cost of up to $100,000 a year. 

Those who would like to produce the copy- 
cat drugs, meanwhile, point out that the makers 
of standard, small-molecule pharmaceuticals 
have prospered, despite the Hatch-Waxman 
Act that opened the door to generic versions 
of their products in 1984. “For more than 20 
years, generic medicines have been improving 
lives,” says Kathleen Jaeger, the president 
of the Generic Pharmaceutical Association in 
Arlington, Virginia. 

The biotechnology industry, however, dis- 
putes claims that generic biologics could save 
billions of dollars in healthcare costs, arguing 
that the drugs they imitate operate in a limited, 
niche market. It also argues that biogenerics 
wouldn't offer new cures or treatments. 

Biotech firms are at pains to point out that 
their very business model will be put at risk 
— and patients will suffer — if Congress acts 
without considering the costs to innovator 
companies. “Wall Street will evaluate [the legis- 
lation’s] impact on the profitability of investing 
in biologics companies,” says Jim Greenwood, 
president of BIO. “If that’s likely to decline, it 
will reduce the amount of investment in these 
companies and we will have a commensurate 
reduction in new and spectacular products.” 

Among Greenwood’s chief complaints is 
that the strictest bills would provide innovator 
companies no guarantee of any period of mar- 
ket dominance. BIO is calling for 14 generic- 
free years for innovator biologics — the same 
period written into Inslee’s bill. Last year, the 
European Union settled on ten years as a suit- 
able period for biologics to be insulated from 
generic competition (see Nature Rev. Drug 
Disc. 5, 445; 2006). 

And it looks as though Congress will pass 
a law that involves a similar compromise. 
“If Waxman’s bill is the stake in the ground for 
the generic companies, Inslee’s bill is the stake 
in the ground for the innovator companies,” 
says Werner. “I think the final product will be 
somewhere in the middle.” 
A beacon of reform 


Long a symbol of East German pride, the Charité medical school is flourishing in the 
twenty-first-century shake-up of German universities. Alison Abbott reports. 


The concrete high-rise of the historic 
Charité hospital was the pride of 
communist East Germany’s medical 
sciences. Built in 1982, its 21 storeys 
were a riposte to the Verlagshochhaus — a 
19-storey tower that the Springer publishing group 
built close to the wall in west Berlin, and that 
was seen as a way to taunt people in east Ber- 
lin with visions of western freedom, progress 
and wealth. The monolithic response from the 
other side of the wall was a showcase build- 
ing for a top research institute — one of the 
few institutes in the former republic that gave 
some scientists the freedom to travel abroad 
and provided a certain independence from the 
all-pervading communist ideology. The ‘Charité’ 
sign on the top of the building could be read 
kilometres away — a proclamation that one of 
the city’s proudest and oldest scientific institu- 
tions stood tall in the east. 

Today, the Charité still holds its head up high. 
Last year it was ranked top university medical 
school in Germany by an independent research 
assessment, creeping ahead of schools in Hei- 
delberg and Munich. Much of its success is 
due to its having grabbed, with an alacrity not 
shown by most other universities, the opportunities 
offered by recent government reforms. 

Founded in 1710 as a plague house outside 
the city gates, the Charité was later converted 
into a hospital, and developed close links with 
the University of Berlin (now Humboldt Uni- 
versity). It was the intellectual home of, for 
example, Rudolf Virchow, the father of modern 
pathology, and Emil von Behring, the bacteri- 
ologist who discovered the diphtheria toxin. 
The Charité officially became Berlin's univer- 
sity hospital in 1927. 

This was a time when Germany was the 
world leader in science and medicine; when 
it was unthinkable, for example, that anyone 
planning a research career would not learn 
German. But only a few years after the Char- 
ité won its university hospital status, the Nazis 
came to power, and by the end of the Second 
World War the picture had changed entirely. 
Physical and moral destruction had left the 
country’s scientific landscape in ruins. The 
Charité’s historic buildings had to be rebuilt 
brick by brick from the rubble. Reading Ger- 
man would never again be compulsory for sci- 
entists from Chicago to Shanghai. 
On both sides of the wall, German science 
eventually recovered from the war and from 
the subsequent shock of reunification. Today 
the country’s scientific impact is among the 
world’s top five, thanks in large part to the 
network of Max Planck Institutes that serves 
as home to the majority of Germany’s most- 
cited scientists. But the country’s politicians 
think that things would be even better if the 
university sector, too, were to house simi- 
lar excellence. They are aware that Germany 
boasts no heavyweights to rival the giants of 
Cambridge in Massachusetts — or, closer to 
home, of London or Paris. Although Germany 
spends more on research than either France 
or the United Kingdom, Heidelberg, Munich 
and Berlin do not shine in the world rankings. 
German universities rarely feature in lists of the 
top 50 worldwide. 

The shackles holding back the universities, 
critics say, were forged from a mixture of tradi- 
tional stodginess and utopian zeal. In the post- 
war years, the strong hierarchy within German 
universities meant, among other things, that 
young scientists were rarely able to run their 
own research. Herr Professor — or the very 
occasional Frau Professorin — made all the 
decisions, applied for all the research grants 
and received guaranteed levels of research 
money from the university, no matter how 
productive he, or she, happened to be. 

The spirit of ’68 and the political awakening 
it brought to Europe's students dovetailed 
with the postwar generation’s concern that 
many of those in charge of the universities 
had served the Nazi regime. The result was a 
drive to democratize academia. For example, 
student, administrative and technical rep- 
resentatives were included on all university 
committees, including the selection boards 
for faculty appointments. Decisions became 
painfully slow. The university president had 
very few powers, and it was never clear who 
was responsible for anything. “It is really time 
to say goodbye to this collective irresponsibility,” 
says physicist Jürgen Mlynek, former 
president of Humboldt University, where the 
professors have a majority of just one in the 
highest academic committee, the senate. 


All things being equal 
In the 1970s, a series of developments 
enshrined the concept of the ‘equality of uni- 
versities’ in law. A degree from one university 
was supposed to be as good as one from any 
other university and the law prevented univer- 
sities from selecting their own students and 
from charging for tuition. The roughly equal 
funding of the roughly equal universities pro- 
vided adequate education for all but it didn’t 
promote competition or innovation. 

And some of the biological sciences suffered 
other political constraints. The idea of genetic 
engineering was mostly rejected — an 
understandable reaction to the appalling abuses of 
Josef Mengele and his cohorts — and that held 
back the development of molecular biology. 
The first German factory for producing 
genetically engineered human insulin was supposed 
to open in the early 1980s, but was famously 
delayed for nearly a decade. Although the 
regulations for genetic engineering are now in line 
with those elsewhere, research on embryonic 
stem cells remains among the most restricted 
in Europe. 

Meanwhile in eastern Germany, research 
had been more or less stamped out of most 
universities. The freedom of thought it required 
was not seen as sitting comfortably with the 
teaching of young minds. Most research took 
place in the Soviet-style institutes of the 
German Academy of Sciences. The Charité was 
one of the few institutions spared. 

Former research minister Edelgard Bulmahn 
helped remove barriers to innovation and reform. 


Flexibility at the Charité 


With her PhD and medical 
qualifications, Seija Lehnhardt 
(pictured) could have carried 
on working at Harvard 
Medical School. There she 
would have found a way to 
divide her week sensibly 
between the clinic and 
research. Most institutes in 
Germany don't have such 
flexibility. Clinicians work 
under rigid employment 
conditions that mean that 
lab work has to be relegated 
to evenings and days off — a 
system that some say explains 
why so much of the clinical 
research in the country is poor. 

But Lehnhardt wanted 
to return to her family and 
friends in Germany, while 
continuing her research and 
qualifying as a neurologist. 
The answer was to be found 
at the Charité University 
Hospital in Berlin (see main 
text), where a flexible attitude 
to financing and the support 
of her clinical chief allow 
her to work part time on an 
unorthodox contract. 

The Charité has set aside 
funds to promote the careers 
of female scientists. A 
Rahel-Hirsch fellowship pays 
Lehnhardt a full-time salary 
as a researcher and provides 
support for her small research 
group, which works on the 
brain’s immune system. That 
allows her to work ‘unpaid’ in 
the Charité’s neurology clinics 
for three days a week — work 
that will allow her to qualify in 
her speciality. 

She sees herself as a 
fortunate exception in a 
system that makes it hard 
for clinician-researchers to 
do either job well. “In Boston 
I knew a lot of happy people 
doing clinics and research — I 
don't see a lot of happy people 
in this situation here.” A.A. 

In the 1990s, reunification, with its bank- 
rupting costs, forced the restructuring of East 
German institutes. The upheaval also provided 
an opportunity to take a closer look at the stag- 
nating system in the west, and in particular at 
the perceived loss of the best young researchers 
to the United States. The result was a series of 
schemes to enliven the universities, for example 
by cutting the time it took students to graduate 
and by increasing competition between institu- 
tions. But the effect of federal initiatives is lim- 
ited by the 16 state governments that hold sway 
over Germany’s 100 or so research universities. 

In 2001, the research minister at the time, 
Edelgard Bulmahn, pushed through legisla- 
tion aimed at removing the obstacles to inno- 
vation and reform that had been put in place 
by some of the state governments. Among its 
provisions was a new salary system for profes- 
sors that allowed performance-related pay. 
Another important reform was the creation of 
fixed-term ‘junior professorships’, also paid for 
by the federal purse. These independent posi- 
tions have allowed 1,145 young academics to 
set up their own research groups at a university. 
At the same time the Habilitation qualification 
required for university teachers stopped being 
mandatory, removing years from potential 
qualifying times. 


Competitive streak 

Since 2001, the federal government has suc- 
ceeded in forcing research organizations over 
which it has more direct control, such as the 
Helmholtz Association (see ‘Uncoiling the 
dead hand’, page 633), to become more competitive. 
But despite the relaxation of rules 
on salaries, Habilitation and hiring of young 
academics, the universities were not obliged 
to change their ways, and at first most didn't 
— even when some state governments, such 
as those of Bavaria, Baden-Württemberg and 
North Rhine-Westphalia, joined the reform 
bandwagon and passed local laws that encour- 
aged their universities to modernize and to use 
their budgets flexibly. 

The Charité, by contrast, has been one of the 
few universities to take up all opportunities for 
reform without hesitation. Unlike many of its 
counterparts in western Germany, the medical 
school was never burdened by inertia. Quite the 
opposite — it had been turned upside down and 
inside out by reunification. Alongside major 
restructuring, the Charité staff had their political 
backgrounds examined: those found to have 
collaborated with the Stasi to the detriment of 
their colleagues were dismissed. Many 
others were dismissed simply because of 
massive overstaffing, particularly among 
the technical-support staff. 


Joining forces 

Staff morale might have been low, but 
funding prospects were better at the 
Charité than at the two medical schools 
of the Free University in west Berlin, 
which lost their lavish federal subsi- 
dies in the cutbacks after reunification. 
In 2003, the cash-strapped Berlin state 
government decided to merge the city’s 
medical schools under the umbrella of 
the Charité, making it a university in its 
own right, but demanding an additional 
33% budget cut to be phased in by 2010. 

The fresh if traumatic start forced on the 
Charité made it easier to make the neces- 
sary changes. Commercial activities, such 
as clinical-trial services, add 10% to the 
€200 million (US$270 million) budget it 
gets from Berlin's state government. And it 
is pulling in enough grant money to make 
up for the reduction in state funds. “The 
early years after reunification were psychologically 
difficult on both sides,” recalls 
Detlev Ganten, head of the merged medi- 
cal schools. “But I’m optimistic now — we 
are really starting to identify with the scientific 
spirit in Berlin before the Nazi period.” 

To weaken the rigid academic hierarchy, 
whenever any academic staff left the Charité, 
Ganten pooled their institutional money and 
support staff and used the shared resources 
competitively, for example to offer good starting 
packages for young academic staff, to improve 
career opportunities for women and to support 
top-performing staff (see ‘Flexibility at the 
Charité’, page 631). Already a third of faculty 
resources are shared out in these ways, and the 
beneficiaries have to reapply every year for their 
share. The pool will grow further at the end of 
this year when all packages agreed over the 
decades will be cancelled and renegotiated. 

Today, the Charité hosts 25 junior 
professorships, the highest number of any medical 
school. Unlike some other institutions, it 
has appointed to those positions people 
whom it truly intends to see tenured at the 
end of the process. Elsewhere, the reforms 
have been less impressive. By 2004, only 
two-thirds of the junior professors had 
gone on to get tenured positions, fewer 
than had been anticipated. Most universities 
didn’t want to offer the few open 
faculty positions they had to junior 
professors (see ‘The Emmy awards’). 

Detlev Ganten is head of the expanded Charité 
medical school. 


The Emmy awards 


The Emmy Noether awards 
are run by Germany's main 
funding agency, the DFG. 
They aim to encourage young 
German scientists working 
abroad to come home. 

This spring, Nature 
surveyed this year's 77 
award recipients about 
their attitudes to Germany 
as a place to do research. 

Of the 55 that responded, 
more than half described 
Germany as being either as 
attractive as, or more attractive 
to young scientists than, 
the countries they had 
left. “I find conditions 
surprisingly good,” says 
biomathematician Korbinian 
Strimmer, who spent two 
years as a postdoc in Oxford, 
UK, and now works at the 
University of Leipzig in 
Germany. “Right now the 
intellectual climate is better 
in the United Kingdom, 
but Leipzig is getting very 
interesting, and the critical 
mass is now arriving.” 

However, more than 
half the recipients judged 
university hierarchies to be 
more intrusive in Germany 
than in the country they had 
left, and more than a third 
said that they thought the 
Habilitation qualification 
was still important for their 
career in Germany. Many 
complained that the effect 
of performance-related pay 
has been to lower salaries, 
because the basic rate 
has been reduced and the 
universities are tight-fisted 
with bonuses. 

“In practice, this reform 
has served only to cut costs 
— it is a battle for scientists 
to get the same salary they 
had before,” says 38-year-old 
Joachim Hermisson, who 
leads an Emmy Noether group 
in population genetics at the 
Ludwig Maximilians University 
of Munich. A.A. & S.S. 


Attractive incentives 

In 2005, the federal government adopted 
new tactics and decided it could best per- 
suade universities to adopt reforms by 
dangling carrots — in the form of compe- 
titions that could be won only by universi- 
ties able to attract the best students and 
faculty and to network effectively with 
their neighbours. The most influential of 
these competitions is the massive three- 
part Excellence Initiative — €1.9-billion 
over 5 years. For comparison, the coun- 
try’s science-funding agency, the DFG, 
has an annual budget of €1.3 billion. 

The initiative is now halfway through. 
Eighteen graduate schools and seven- 
teen ‘clusters of excellence’ have been 
rewarded in the first funding round. 
The initiative can name up to five ‘élite universities’
from among the winners of the other
categories. In the first round, three universi- 
ties — two in Munich and one in Karlsruhe 
— earned the élite label. This should mark 
the end of the pervasive myth that all German 
universities are equal. “The competition has
challenged the dogma of equality, making the
difference between universities apparent,” says
Peter Strohschneider, chair of the German Science
Council in Cologne. “A new paradigm in
German science policy has been established
and it has far-reaching effects for the whole 
university system.” 

“The competition's really put new momen- 
tum into all the universities,” says DFG presi- 
dent Matthias Kleiner, whose agency is helping 
to administer the initiative. “For one thing it 
got people interacting — faculties had previ- 
ously behaved like little kingdoms but they had 
to cooperate for the Excellence Initiative.” 

The Charité is already reaping the benefits 
of the more competitive structure it has cre- 
ated. It won an Excellence Initiative award 
for its graduate school in neuroscience — the 
Berlin School of Mind and Brain — worth 
€1 million a year for five years, and has been
short-listed in the second round, to be decided 
this autumn, for a neuroscience research clus- 
ter worth around €6.5 million per year for five 
years. This success adds to a prestigious award 
it won from the federal research ministry last
year to create the Berlin-Brandenburg Center 
for Regenerative Therapies, which comprises 
23 research groups and is funded with €45 
million over four years. Also last year it was 
ranked top German medical school in impact 
and grant money by the independent Bertelsmann
Foundation in Gütersloh.


Slowly but surely 

Some of the older Charité professors were 
not happy with the rapid pace of change, says 
Ganten. But the young faculty members are. 
“I’d be a liar if I said the atmosphere here was
quite the same as San Francisco and Berkeley 
but there is already a huge difference compared 
with when I studied here in the mid-1990s,” 
says Charité neuroscientist Dietmar Schmitz. 
“It’s getting there slowly.”

Schmitz returned to Germany in 2002 with 
both a junior professorship and an Emmy
Noether award, designed to attract back emigré
scientists, and now has tenure and coordinates
the Charité neuroscience cluster. Two
main things attracted him back from the 
United States. “The Charité offer was tenure 
track with a good package,” he says. And he 
was aware of a growing neuroscience buzz 
around the city and its academic community. 
The city had chosen neuroscience as a focus 
and built up appropriate infrastructure. Top 
neuroscientists had already started to arrive 
there from elsewhere. 


“German universities are getting more 
professional,” observes Christian Spahn, a 
structural biologist with tenure who also 
joined Charité as a junior professor in 2002 
— despite the offer of a faculty position in 
the United States. He is now being courted by 
other German universities and knows he will 
get the facilities he needs to match his research 
ambitions. But Spahn doesn't think that the 
changes in Germany go far enough in trying 
to improve competition. “What we really need 
now is for funding agencies to introduce [pay- 
ment to cover] overheads,” he says, “so univer- 
sities know that if they employ a good scientist 
he or she will bring in regular money — more 
regular than the occasional competition like 
the Excellence Initiative.” 


As things gradually improve for German
universities, the Charité is planning to
upgrade its concrete high-rise to celebrate
its rise in status. An €86-million restoration
project has been launched to give it a new
facade and seven additional floors. The reasons
are practical — the medical school needs
more space. But the extension and expansion
are not without their own propaganda purpose.
The newly heightened building will
go by the name of the Leuchtturm der
Lebenswissenschaften Berlin — the Beacon of
Life Sciences.
Alison Abbott is Nature's senior European
correspondent. Additional reporting by
Sophie Stiegler.

See Editorial, page 613.

Cosmopolitan Berlin: Charité researcher Christian Spahn discusses his work with Venezuelan colleague Francisco Triana-Alonso.


Uncoiling the dead hand

Germany's Helmholtz Association runs 15 large
national research centres addressing a range of
government priorities. The association has acquired
a reputation for stodginess and civil-servant
mentality, and with federal money providing 90%
of its income, it was a prime target for Germany's
reforms.

A bill introduced in 2001 (see main story) aimed
to make the research centres more competitive and
responsive to changing government strategies, and
the association's drive to implement these changes
has been ruthless.

Then-president Jürgen Mlynek changed the system
from a guaranteed budget for every centre to one in
which the entire €1.6-billion (US$2.2 billion)
annual budget is pooled into strategic areas from
which scientists must apply for support. The
strategic value and quality of the research
proposals are evaluated by panels of experts, half
of whom are foreign. Less than 5% of the allocations
have shifted since the practice was introduced in
2001, but some research areas have received a big
boost. Financing of energy efficiency, for instance,
has increased by 15% and the budget for
photovoltaics has doubled.

Mlynek has partitioned off an annual €25 million,
rising to €60 million next year, for a president's
initiative fund, which allows him to indulge in some
social engineering he thinks is much needed. For
example, centres that appoint women to top positions
get a million-euro top-up, and the fund also
finances up to 100 tenure-track junior research
groups.

The president's fund also supports, with €10 million
over 5 years, a suite of ‘Helmholtz alliances’
between centres and universities. The ‘Physics on
the Terascale’ alliance includes the particle-physics
centre DESY in Hamburg, the Research Centre Karlsruhe
and 17 German universities. Its strategic aim is to
keep the German particle-physics community, about to
lose the country’s main particle accelerator HERA at
the DESY lab (pictured), in good enough shape to
exploit the Large Hadron Collider at CERN in Geneva
when it starts pumping out the particles in 2008. A.A.
Peaceful primates, violent acts


Brought up in the Congo basin, Jonas Eriksson has worked through a war and battled poachers to 
help reveal the secrets of bonobo societies. Carl Gierstorfer reports. 


In 1998 in Lomako, a study site in the
northwestern Equateur province of the 
Democratic Republic of Congo, a peace- 
loving primate closely related to the 
chimpanzee showed its darker side. A group 
of bonobos (Pan paniscus) was feeding when 
a male started to act aggressively towards a 
female with an infant — an unwelcome act 
in the typically female-dominated primates. 
Suddenly, all hell broke loose. The females 
banded together to attack the male, and beat 
him viciously for more than half an hour. The
other males fled, and the wounded aggressor 
disappeared, never to be seen again. 

The event epitomizes a paradox in bonobo 
societies. DNA studies¹ done at the site have
shown that the females aren't related, so coop- 
eration would not benefit their kin directly. 
So why would females cooperate to exclude 
aggressive males? That is one thing that Gott- 
fried Hohmann and Barbara Fruth from the 
Max Planck Institute (MPI) for Evolutionary 
Anthropology in Leipzig, Germany, had been 
studying at the Lomako site for eight years 
before the thrashing. But soon after the inci- 
dent, violent raids from a different primate — 
human rebels from nearby Rwanda — evolved 
into a full-blown war that eventually reached 
Lomako and forced the researchers to leave. 


A year before the event, Jonas Eriksson 
(pictured above), a former graduate student 
at the University of Uppsala in Sweden, had
joined the research team. The son of Swed- 
ish Baptist missionaries, Eriksson had spent 
his childhood in the pristine forests of the 
Salonga National Park in the 
central Congo basin and had 
gained a detailed knowledge 
of the region. While working 
on his degree, he learned about 
primate behaviour and field 
studies. The softly spoken 38- 
year-old says that he thought 
of his childhood hunting trips 
with bow and poison arrow 
and knew he could contribute something 
to the field. He was to prove instrumental in 
keeping the research going during the crisis. 

In 2000, Hohmann and Eriksson set out on 
a trip worthy of Henry Morton Stanley's epic 
exploration of the Congo basin in the 1870s. 
They combed the better part of the bonobo’s 
range — around 200,000 square kilometres 
— by foot and bicycle, hunting for bonobo 
faeces, scooping them from the forest floor, 
sealing them in plastic bags and sending them 
to Leipzig to sequence their DNA. Although a 
dirty job, this way of collecting DNA samples 
puts as little stress on the bonobos as possible.
Their analysis of 34 males from four
distinct sites² showed that males from the
same site had more similar Y chromosomes 
than did those from different sites, indicat- 
ing that related males stay together, as they 
do in chimp societies. But 
mitochondrial DNA from 
these males, which is inherited 
down the female line, did not 
show such clustering, indicat- 
ing that females tend to leave 
the group. Combined with 
their observation that females 
will work together to maintain 
their dominant status within 
their society, these findings further chal- 
lenged the idea that genetic relatedness plays 
any part in female cooperation. 

Brenda Bradley, an evolutionary geneticist 
at the University of Cambridge, UK, says that 
many researchers realized that they were “over- 
estimating genetic relatedness when they see 
cooperation.” Eriksson and colleagues’ work 
helped to clarify that issue by providing data 
on long-range gene flow in the apes, she says. 

Chimpanzees (Pan troglodytes) have a simi- 
lar kinship pattern but behave differently. Like 
the bonobos, female chimps in a group are 
generally unrelated. But unlike the bonobos,
chimp societies tend to be dominated by the 
males. Whereas violent encounters are the 
norm in the chimp society, conflicts such as 
that observed at Lomako are rare in bonobos. 
Perturbations to bonobos’ social order are 
generally defused through sexual acts, often 
in homoerotic encounters between females. 


Secret for success 

Eriksson and Hohmann had been hunting 
for more than just bonobo droppings on their 
trek. They had also been looking for a new 
study site and settled on the southern reaches 
of Salonga National Park. Eriksson’s mastery 
of the Congolese language and culture was
integral to securing permission from villag- 
ers to use the site. “He has a strong emotional 
attachment to Congo and the Congolese peo- 
ple,” Hohmann says. 

Fruth says that she admires Eriksson’s abil- 
ity to penetrate the Congolese culture. But his 
intimate link also has its downsides: Fruth says 
that Eriksson's ‘Congolese’ way of approach- 
ing things means that he refuses the pace of 
the western world and prefers a more laid- 
back lifestyle. “He has to be pushed to bring 
things to an end,” she says. Nevertheless, the
team managed to secure the study site in 2000, 
and work could resume. For Hohmann, Fruth 
and Eriksson, a new opportunity to explore 
the bonobo paradox began to take shape. 

Bonobos cooperate more than genetics predicts.

The researchers think that the cooperation
between unrelated females to
keep aggressive males in check was to protect
against infanticide, which is common
in male chimps — bonobos’ closest relatives.
Moreover, the females may pool their
efforts to collect high-value resources such 
as meat. Hohmann and Fruth have found 
that at Salonga, meat consumption is much 
more pronounced than previously thought 
in the normally fruit-eating apes. The prey is 
caught by females, possibly even in groups, 
and males rarely share in their spoils — a 
striking contrast to chimpanzees. “They are 
just sitting there, begging for meat, or even 
guarding the kids only to score well with the 
females,” Fruth says.

The bush-meat trade is starting to threaten the bonobo study site.

The lack of male aggression could be down to
the plentiful supply of good-quality resources.
Meat might be a delicacy enjoyed only by the
females, but fruit is abundant and sex is readily
available, reducing the need for competition.
But while Eriksson was in Leipzig 
sequencing the bonobo droppings, a new 
problem erupted. The bitter war that shook 
the country and cost an estimated four million 
human lives had ended, but leftover weapons 
were being put to use in the bush-meat trade. 
“Suddenly, in 2005, I got these reports from 
my friends in Congo that the poachers were 
coming closer and closer to that area that’s 
really fond to me,” says Eriksson. For more than
two years poachers had been moving steadily 
into the Salonga National Park, mainly targeting
the abundant and easy-to-kill red colobus
monkey. “They pick them off like fruit,” Eriksson
says. As colobus numbers dwindle, the
bonobos are more likely to be targeted. 


Trading places 
So, with support from his mentors, Eriksson 
abandoned his research to protect the site. He 
convinced local park rangers and villagers to 
help him chase out the poachers, armed with 
automatic weapons. “I think the combination 
of being foreign, white-skinned, but speaking 
to them in a way that penetrates their culture 
and language is the key,” Eriksson says. His 
approach has been effective in keeping the 
poachers out of the study site, at least for now. 
Having put down his pipette for an AK-47, 
Eriksson says that he’s determined to return to 
science, but not necessarily in the same role. 
“I probably won't spend too much more time
in a lab; it’s a waste of time. There are other 
people who are much more skilled than me.” 
Hohmann chides that Eriksson's “academic 
ambitions are easily outrun by his liking for 
adventures”. Nevertheless, Salonga is still in 
danger and the conflict is bound to escalate as 
the poachers take greater risks. Eriksson says 
that he has already received death threats. 
Having seen their Lomako site collapse, the 
team is determined to hold on to the one in 
Salonga. Too many questions remain about 
how bonobos manage to avoid violent con- 
flicts. Ironically, saving the peaceful bonobos 
from the poachers may require more aggres- 
sive displays. Eriksson says: “I did not spend 
years studying to run around in the forest with 
a Kalashnikov and my finger on the trigger. 
But emotionally, it is very easy to convince 
myself that these steps are necessary. I have 
to try to do something.” 
Carl Gierstorfer is a freelance writer in Berlin. 
To see a video of Jonas Eriksson discussing his 
work, see http://tinyurl.com/ywnv47. 
Those who are crossing
boundaries need less talk, 
more help and flexibility 


SIR — Interdisciplinary, cross-disciplinary, 
multidisciplinary and transdisciplinary 
research are increasingly perceived to be at 
the frontier of science. But as Adina Payton 
and Mary Lou Zoback point out in Recruiters 
(‘Crossing boundaries, hitting barriers’ 
Nature 445, 950; 2007), it is not clear how
the scientific community can gain from
their evolution.

Despite a shift towards an interdisciplinary 
research culture, we are yet to grapple 
with how to support a growing number 
of interdisciplinary researchers. As 
interdisciplinary postgraduate research 
students, we face this reality head-on. 

We have found it difficult to synthesize 
the separate perspectives of two or more 
disciplines into a meaningful middle ground. 
Unless the scientific community identifies 
strategies for supporting interdisciplinary 
researchers to negotiate this middle ground, 
little progress can be made. Here we suggest 
two useful approaches. 

First, interdisciplinary researchers are 
expected to develop a different skill set from 
that of their single-discipline colleagues. In 
this ‘interlocker’ role, they engage in a shared 
conversation between disciplines and work 
through the tensions this creates. This is 
more than simply negotiating the different 
languages and ways of working — it is about 
appreciating a breadth of knowledge in theory, 
approach and discourse. 

Unfortunately, few systems accommodate 
this type of researcher — as is sadly 
demonstrated by emerging frameworks 
designed to assess research quality in New 
Zealand, the United Kingdom and Australia. 
Interdisciplinary committees are needed to 
assess research proposals, to review grant 
applications and to examine theses. This 
would be more effective than the current 
practice of putting interdisciplinary 
researchers in assessment ‘silos’ where 
they are unrealistically measured against, 
and by, people in a single discipline. 

A second challenge is the disjunct between, 
on one hand, rhetoric encouraging inter- 
disciplinary research and, on the other, the 
lack of institutional structure and support for 
it. Although we are encouraged to work in
interdisciplinary environments and to join 
interdisciplinary research clusters, we face 
numerous administrative hurdles. Cross- 
enrolment of interdisciplinary students 
is seldom acknowledged, and adequate 
resources and structures — such as guidance 
on writing for interdisciplinary audiences, 
or longer candidatures for postgraduate 
students — are rarely provided to support 
the interdisciplinary researcher. 

It would be simple for institutional leaders 
to ask current interdisciplinary researchers
about the challenges they face and to 
document these issues. These leaders could 
then address the issues by formalizing the 
interdisciplinary researcher role and reducing 
demands to satisfy the needs of multiple 
disciplines. Supportive environments must be 
created if we are committed to achieving 
interdisciplinary research goals. 

James A. Smith*†, Gemma E. Carey*‡

*Discipline of Public Health, School of Population
Health and Clinical Practice, University of Adelaide
†Discipline of Medicine, School of Medicine,
University of Adelaide
‡Discipline of Anthropology, School of Social
Sciences, University of Adelaide, South Australia
5005, Australia


Readers are welcome to comment at
http://blogs.nature.com/nautilus/2007/06/creating_an_interdisciplinary.html


Limitations of molecular 
genetics in conservation 


SIR — Your News Feature ‘The species and
the specious’ (Nature 446, 250-253; 2007) 
provided an interesting assessment of recent 
research on genetics, species taxonomy and 
conservation. 

Although mitochondrial DNA (mtDNA) 
and other molecular genetic data are 
informative, they must be viewed in the 
context of natural history and population 
biology. A strictly phylogenetic approach 
using genetic data may not consider the 
limitations of gene phylogenies or the 
relevance of organism-level data. The 
sciences of systematics, population genetics, 
phylogenetics and taxonomy require 
assessment of different types of data. As you 
note, boundaries between groups within 
species are not always clear, which has led to 
extensive assessment of the appropriate units 
for fish and wildlife management and 
conservation. I suggest that management 
should focus on a species’ occurrence in 
geographical areas rather than seemingly 
endless debate over vague terms such as 
genetic discreteness or evolutionary legacy, 
and proliferation of new intraspecific 
terminology for what are essentially 
populations. 

One example of this debate is provided in 
your News Feature, in which you note that 
there is similar mtDNA in polar bears and 
brown bears that brings their status as species 
into question. However, morphology, 
behaviour and habitats show these to be 
different species regardless of their mtDNA 
relationship; therefore management of polar 
bears and brown bears as separate species 
is appropriate. 

The limitations of genetic data are apparent 
from the contrasting patterns of similar 
mtDNA in different species (polar bears and
brown bears) and divergent mtDNA within
populations of one species, black bears
(M. A. Cronin et al. Can. J. Zool. 69,
2985-2992; 1991).

Matthew A. Cronin 

University of Alaska Fairbanks, School of Natural 
Resources and Agricultural Sciences,
Palmer Research Center, 533 East Fireweed
Avenue, Palmer, Alaska 99645, USA 


Information from patent 
office could aid replication 


SIR — Your News Feature ‘The hard copy’
(Nature 446, 485-486; 2007) accurately 
highlights the limited availability of 
information on stem-cell research 
methodologies — owing to competition 
among labs, the commercial value of such 
information and space restrictions in 
high-quality journals — which contributes 
to other labs’ inability to replicate and verify 
the results. 

It might sometimes repay scientists to 
look beyond conventional journals for 
information, in this or other disciplines, 
particularly to patents or patent applications. 
Thanks to the strict enablement requirements 
of patent law and patent offices in relation to 
inventions, one can often find more detailed 
methodology in patent documents than in 
journals with severe page limits. 

A very good example of comprehensive 
detail in certain non-embryonic stem-cell 
methodologies is a PCT application 
WO/2006/028723 (Non-Embryonic 
Totipotent Blastomer-Like Stem Cells and 
Methods Therefor), which includes surgical 
procedures in organ removal, isolation of 
cells, and composition and preparation of 
culture media. In this instance, the level of 
detail and volume of text relating to 
methodology far exceeds that which many 
peer-reviewed journals can accommodate. 

Some journals publish methodology and 
protocols online as Supplementary 
Information to the main paper or in separate 
publications (an example is Nature Protocols, 
which encourages user comments). Often, 
though, journals are only starting points in 
complex paper trails related to methods. In 
these circumstances, patent documents could 
contain the most methodology related to an 
invention in a single document.

Harry Thangaraj 

Centre for the Management of Intellectual 
Property in Health Research and Development 
(MIHR), Oxford Centre for Innovation,
Mill Street, Oxford OX2 0JX, UK


Science publishing issues of interest to 
authors are regularly featured at Nautilus 
(http://blogs.nature.com/nautilus), where 
we welcome comments and debate. 
Scot on the rocks


Michael A Taylor 
National Museums of Scotland Publishing: 
2007. 144 pp. £12.99 


The Cromarty Firth in Scotland currently har- 
bours huge oil-drilling platforms. At night, the 
town of Cromarty offers a spectacular ballet of 
illuminated Eiffel Towers swaying silently on 
the sea, opposite the shore where, around 1830, 
Hugh Miller (1802-1856) took his first steps in 
geology by collecting fossils. The oil from the 
North Sea and the platforms are symbols of 
utilitarian geology, but few people today know 
of Miller’s role in defending and popularizing 
the notion of long geological time and the use 
of fossils for dating rocks, thereby making 
Victorian economists realize that “geology... 
has also its cash value”. 

Michael Taylor’s Hugh Miller is a remarkable 
account of the life of this extraordinary Scots- 
man, known by scientists for his role in the early 
history of palaeontology and geology, and by 
Scotsmen for his writings about Scottish folk- 
lore, history and nature, as well as for his role in 
the disruption that led to the birth of the Free 
Church of Scotland (the ‘Kirk’) in 1843. Much
has been written about the various aspects of 
Miller’s life, but often in an unbalanced way, 
focusing on nature, society or religion. Taylor's 
biography provides an outstanding synthesis 
of all the facets of Miller’s activities, from his 
childhood in Cromarty, his manual labour as a 
stonemason and his discoveries as a self-taught
palaeontologist, to his career as editor of the 
newspaper The Witness and his suicide in 1856. 
Moreover, this book draws heavily on Hugh 
Miller’s writings and letters. 

Taylor reconstructs the character: how Miller, 
draped in plaid, rambled in search for fossils 
or inspiration, and how he behaved in family 
life or when socializing and debating “Kirk” 
questions. This gives colour to the austere text 
and engravings of his books Old Red Sandstone 
(1841), Footprints of the Creator (1849) and 
Testimony of the Rocks (1857), which are clas- 
sics in Britain and among the Scottish diaspora 
worldwide, and are masterpieces of Victorian 
popular science. 

Miller’s previous biographies seeded some 
myths that Taylor refutes. For example, Miller 
did not become interested in fossils because he 
was a stonemason in his youth, since the stones 
he carved were generally barren. Also, he did 
not commit suicide because he could not reconcile
the long geological time with Genesis,
Taylor argues, nor because he was depressed
by the disastrous causes of the Highland Clearances.
He was simply overworked and suffered
from a brain disorder that caused him unbearable
headaches and dreadful nightmares.

Hugh Miller, reporting for The Witness newspaper on the launch of the Free Church of Scotland in 1843.

Miller's interest in geology was triggered 
by his findings on the shore of the Cromarty 
Firth, notably of the strange 390-million-year- 
old armoured fishes in red sandstone from the 
Devonian period, whose anatomy was then 
mysterious. John Malcolmson, a learned ama- 
teur naturalist, introduced Miller to the great 
names of geology and palaeontology, such as 
Roderick Murchison and Louis Agassiz, who 
confirmed that his findings were the earliest 
fishes known at that time. 

Miller was fascinated by these fishes from 
“a different creation”, now known as antiarchs
(Pterichthyodes) and arthrodires (Coccosteus) 
and, as he was a talented artist and a remarkable 
observer, he provided the first reconstructions 
of their bony armour, which Agassiz praised. 
Encouraged by this international recognition, 
Miller developed a tremendous interest in the 
relative age of rock layers and the fossils they 
contain. This field of geology, now called biostratigraphy,
had just begun to be formalized
and the succession of entirely different fossil 
animals and plants through time was generally 
interpreted as a series of catastrophes and new 
creations, an explanation that at first seemed 
to fit with Miller's religious faith. 

However, he quickly realized that a literal 
explanation of this succession of creations in 
the light of Genesis was untenable. Reject- 
ing the pre-darwinian, transformist ideas of 
his time, such as Jean-Baptiste Lamarck’s or 
Robert Chambers’, he took Agassiz’ views of
the ‘three-fold parallelism’ as evidence for a
divine plan. He believed in the classification 
of living beings based on the hierarchy of the 
characteristics — the more general the char- 
acteristics, the earlier they appear in time and 
the earlier they occur in the embryo. By this 
reasoning, the earliest fossil organisms, being 
closer to Creation, were, like early embryos, 
more ‘perfect’, whereas their living representatives,
like adults, were ‘degraded’. This idea of
the ‘progress of degradation’ was widespread 
among naturalists of that time and at odds with 
the darwinian view of evolution as a progress
toward adaptation and fitness. 

Miller died three years before Darwin's Origin 
of Species was published. He is, as Taylor puts it, 
“regarded as a loser in the crucial evolutionary 
debate... That is simply because it never really 
began in Miller’s lifetime.” But Miller, along with
other contemporary palaeontologists, paved 
the way to evolutionary concepts. All that was 
missing was a process that did not need divine
intervention, and Darwin provided it.

Hugh Miller is superbly written, clear and 
readily accessible to those who have no back- 
ground in geology, palaeontology or Scottish 
history. It is to be strongly recommended to his- 
torians of science, lay naturalists and any reader 
interested in Scottish life and history.
Philippe Janvier is at the CNRS, Département 
Histoire de la Terre, Muséum National d'Histoire 
Naturelle, 75005 Paris, France. 


Brain botch 


The Accidental Mind: How Brain Evolution
Has Given Us Love, Memory, Dreams,
and God

By David Linden 

Harvard University Press: 2007. 288 pp. 
$25.95, £16.95 


Georg Striedter 

The human brain, and hence the human 
mind, is not an optimal, designed-from- 
scratch apparatus. Rather, it is an imperfect 
amalgam of shoddy components. That is the 
central thesis of David Linden’s new book 
The Accidental Mind. Neurons are slow, leaky, 
and unreliable — hardly ideal computing ele- 
ments. The whole brain, too, is not designed 
to the plan of some omnipotent engineer. 
Instead, evolution has endowed it with plenty 
of ‘anachronistic junk’. Which is why, accord- 
ing to Linden, our minds often distort reality 
and can lead us to act foolishly. For example, 
when you reach out to touch something, your 
brain filters out what it expects. This selective 
neglect of expected input allows us to focus 
on unexpected stimuli, but it can be counter- 
productive. It may explain, for instance, why 
pushing and shoving confrontations tend to 
escalate. When someone pushes you, you feel 
it more than when you push the other with 
the same force, because the sensation caused 
by your own push is largely, though uncon- 
sciously, expected by your brain. 

Linden tells his story well, in an engaging 
style, with plenty of erudition and a refreshing 
honesty about how much remains unknown. 
The book should easily hold the attention of 
readers with little background in biology and 
no prior knowledge of brains. It would make 
an excellent present for curious non-scientists 
and a good book for undergraduates who are 
just entering into the brain’s magic menagerie. 
Even readers trained in neuroscience are likely 
to enjoy the many tidbits of rarely taught infor- 
mation — on love, sex, gender, sleep and dreams 
— that spice up Linden’s main argument. The 
Accidental Mind stands out for being highly 
readable and clearly educational. No doubt, the 
human brain evolved along a constrained path 
and is, in some respects, designed imperfectly. 
Linden will send that message home. 

Regrettably, Linden neglects to cover some 
material that could have boosted his thesis. 
Particularly interesting would have been a dis- 
cussion of the various “fast and frugal heuris- 
tics” that humans use to understand the world 
(for example, if you recognize one object but 
not another, then the former is probably big- 
ger, better or more valuable). Even though such 
heuristics may sometimes yield inaccurate 
results, they evolved because they are generally 
‘good enough’ and faster to execute than ‘opti- 
mal’ cognitive strategies. This, incidentally, is 
why such heuristics are used by engineers to 
build autonomous robots. Old-style robots that 
tried to analyse their world veridically by com- 
puting all costs and benefits of possible actions 
were slow, fragile and cumbersome. The newer 
robots act foolishly in some contexts, but they 
are fast and effective in their normal terrain. In 
many ways, they imitate our brains. 

Another area Linden oddly neglects is evo- 
lutionary neuroscience. This field has made 
impressive strides in the past 20 years, but 
instead of discussing these, Linden reiterates 
the now outdated theory that mammal brains 
evolved by adding a neocortex to a “reptilian 
brain core”. This theory is probably false, as 
most experts agree that the mammalian neo- 
cortex evolved out of a structure that exists in 
all reptiles, though the reptilian cortex does not 
have the complexity or size of its mammalian 
counterpart. Amending Linden’s analogy, one 
might say that human brains evolved not by 
the addition of new scoops to an old ice cream 
cone, but by the modification of pre-exist- 
ing scoops. This insight would actually have 
bolstered Linden’s thesis that 
brains are subject to historical 
constraints. More difficult to 
show is that the use of pre-exist- 
ing parts imposes functional 
constraints or ‘bad design’. 
Linden does write about some 
functional constraints on human 
brains, such as neuronal noise. 
This is an interesting idea, but 
noisy neurons may be flawed 
mainly in comparison to stand- 
ard computer components. A 
shift in perspective suggests 
that noisy neurons, assembled 
en masse, excel at overcom- 
ing component failure (that is, 
brain lesions). Indeed, in the 
rough and tumble world of real 
organisms, fault-tolerance may 
well be more vital than ultra- 
fast, exhaustive computing. In 
other words, in order to distin- 
guish neuronal design features 
from bugs, we need to know the 
brain’s performance specifica- 
tions, which still remain debat- 
able. One could reasonably 
argue, for example, that pushing 
your opponent harder than they 
pushed you is adaptive, or good 
design in evolutionary terms, 
because it demonstrates your 
physical combat strength efficiently. 

Linden is right to stress that brains evolved, 
but hasty to conclude that they are flawed in 
their design. We still know too little about 
the brain’s inner workings to judge how well 
it does its job. What we do know, and what 
The Accidental Mind helps us to realize, is 
that the human brain is not designed as many 
have imagined. Our brains are not hydraulic 
devices (as Descartes had claimed), phone 
switchboards or desktop computers. All those 
analogies are weak. Indeed, our predilection 
for solving problems by analogy often misleads. 
Still, analogical thinking probably worked well 
enough in our past to be selected for. Whether 
we view it as a boon or a bug depends on our 
perspective. 
Georg Striedter is an associate professor at the 
Department of Neurobiology and Behaviour, 
Univ. of California, Irvine, California 92697, USA. 
Rare insights 


Madagascar's biodiversity is unique and 
imperilled. Separated from the mainland 
of Africa for 160 million years, the island 
is home to thousands of species found 
nowhere else, as detailed in The Natural 
History of Madagascar, a paperback 
edition of which was published in March 
(University of Chicago Press, $50). This 
hefty desk reference features expert 
contributions covering the history of 
scientific exploration in Madagascar, 
its geology, climate, ecology and 
conservation, as well as its plants, 
invertebrates, fishes, amphibians, 
reptiles, birds and mammals. The 
book also includes over four hundred 
illustrations and photos — including 
this one of a lowland streaked tenrec 
(Hemicentetes semispinosus), found in 
the forest of Andrambovato in the south 
east of the island. These small, insect- 
eating mammals live together in burrows 
and make subsonic calls by rustling their 
spines; like many Malagasy species they 
are losing habitat to deforestation. 


Cancer case histories 


David G Nathan 
Wiley: 2007. 272 pp. $24.95, £16.99 


Using the histories of three of his patients, 
David Nathan tells the remarkable, some- 
times frustrating, story of the development of 
modern cancer drugs. The Cancer Treatment 
Revolution is extremely well written by this 
retired, leading figure in the US cancer scene, 
who has contributed greatly to the scientific 
and clinical research. 

Its detailed account of the history of chemo- 
therapy is fascinating, even if it does suggest 
that everything was done in Boston. It is so 
difficult to imagine how the early human 
experiments could ever have been carried out 
in today’s ethical climate. Pumping little chil- 
dren full of horrible drugs to obtain just a few 
weeks’ survival benefit is no fun for any doctor. 
But without those pioneers and the suffering 
children, we would not have the drugs for can- 
cer we routinely use today. 

The book gives a very readable account of the 
discovery and clinical development of molecu- 
larly targeted therapies over the past ten years. 
This has been a major triumph for the rational 
application of molecular biology to one of the 
greatest unmet medical needs of our time. We 
now seem to be at the beginning of a revolution 
— moving on from cancer treatment that blasts 
away malignant cells using toxic chemicals to 
cause widespread havoc to the patient's physi- 
ology, towards a more specific and gentler, 
patient-tailored approach. Converting cancer 
into a chronic, controllable illness now seems 
to be a distinct possibility. But we're not quite 
there yet. 

The tensions between the medical commu- 
nity, the research funders and institutes, the 
US Food and Drug Administration, the phar- 
maceutical industry and the taxpayers have 
dramatically increased and are well outlined 
by Nathan. He clearly doesn't like the politics 
— stating “science policy in the Bush admin- 
istration can only be described as absurd and 
dangerous”. Wonderful stuff — in some coun- 
tries he could find himself without a job, in 
others in jail for treason and in a few just disap- 
pear. But I'm sure most readers would like some 
elaboration. Nonetheless, there is no doubt that 
the United States has committed huge amounts 
of money to the war on cancer. Whether it has 
been a worthwhile quest remains to be seen but, 
as Nathan points out, it has heavily subsidized 
the entire biomedical research endeavour. 

The biggest difficulty with this book is 
working out exactly who it is written for. Most 
patients and their carers would find it scien- 
tifically too demanding. Even with its glossary, 
the vocabulary is more for a Nature subscriber 
than a newspaper reader. Most of it would be 
simply irrelevant to any individual with can- 
cer — of the three detailed case histories cov- 
ered, two are about extremely rare conditions 
(gastrointestinal stromal tumour and mixed- 
lineage leukaemia). The emotional content 
all seems a little hollow — almost as though 
it’s been added as an afterthought. Doctors on 
the whole shudder when they see a chapter 
headed “Ken's story”. We tend to try to box off 
the human side of disease so we can get on with 
the business in hand: doing the best we can to 
cure or prolong life. When we read about the 
science of cancer we want it unbundled from 
the fear, concern and hope. There are plenty 
of excellent psychosocial texts around. And 
patients prefer a much more user-friendly and 
less technical communication style tailored to 
their precise clinical problem. There are some 
excellent examples of this genre from various 
cancer charities on both sides of the Atlantic as 
well as Adam Wishart’s One in Three (Profile 
Books, 2006) or Rosy Daniel's “Cancer Lifeline 
Kit” (www.healthcreation.co.uk/kit.htm). 
Maybe senior academic physicians should 
stick to writing about the science of cancer 
treatment and get others to do the populari- 
zation. Currently, the alternative medicine 
movement dominates the bookstore shelves 
on cancer. Wacky diets, bizarre relaxation 
exercises, crank healers and of course herbal 
remedies all promise a cure without any side 
effects. These texts are inspirational, conclusive 
and positive. In an era of patient choice, that 
seems to be what the customers really want. 
So while I enjoyed and learned a lot from this 
great account of the wonderful achievements 
of science and medicine, I feel it is unlikely to 
be a bestseller at the airport. 
Karol Sikora is the Medical Director of Cancer- 
PartnersUK, 21 Barrett Street, London W1U 1BD. 
ESSAY 


DETERMINISM 


Chaos tamed 


Even though our view of the physical world has shifted from that of determinism to randomness, 
randomness itself can now be exploited to retrieve a system's deterministic response. 


Kees Wapenaar and Roel Snieder 


In the nineteenth century, the world of 
physics was one of order. Pierre-Simon 
Laplace was a key proponent of the 
deterministic Universe. In this model, 
the future is completely predictable 
if one knows the forces between all 
particles as well as their positions and 
velocities at any one moment. Take, 
for instance, a ball kicked into a for- 
est. The ball bounces repeatedly off the 
tree trunks, but if you know the origi- 
nal position of the ball, its velocity and 
the trees’ locations, you can determine 
the future motion of the ball from the 
player's initial kick. 

In the twentieth century, Heisenberg’s 
uncertainty principle shattered the deter- 
ministic dream. In the quantum world, 
only the probabilities for events are 
constrained by the laws of quantum 
mechanics. So for an atom-sized soccer 
ball kicked into the forest, the trajectory 
is not determined, but the probability for 
every imaginable trajectory is. 

Even for macroscopic systems, deter- 
minism did not survive into the twentieth 
century. At that point Henri Poincaré, in 
a visionary anticipation of chaos theory, 
showed that even tiny uncertainties in ini- 
tial conditions can grow exponentially with 
time to make motion at a later time inde- 
terminable, for all practical purposes. So, 
when the soccer ball is kicked a number of 
times in slightly different directions, it hits 
the same trees during the first few bounces 
at slightly different positions; but over time 
the trajectories diverge, and after a few 
bounces the ball may move in completely 
different ways between the trees. 
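
The divergence that Poincaré anticipated can be seen in a few lines of code. The sketch below does not simulate a ball among trees. As a stand-in it iterates the logistic map x → 4x(1 − x), a standard textbook example of chaotic motion, from two starting values that differ by one part in ten billion. The map, the starting value, the size of the perturbation and the function names are all illustrative assumptions.

```python
def logistic_orbit(x0, steps):
    """Iterate the chaotic logistic map x -> 4x(1 - x), a stand-in for the bouncing ball."""
    orbit = [x0]
    for _ in range(steps):
        x0 = 4.0 * x0 * (1.0 - x0)
        orbit.append(x0)
    return orbit


def separation(x0, perturbation, steps):
    """Distance, step by step, between two orbits whose starts differ by `perturbation`."""
    first = logistic_orbit(x0, steps)
    second = logistic_orbit(x0 + perturbation, steps)
    return [abs(a - b) for a, b in zip(first, second)]


if __name__ == "__main__":
    gaps = separation(0.2, 1e-10, 60)
    for step in (0, 5, 10, 20, 30, 40, 50, 60):
        print(f"step {step:2d}: separation = {gaps[step]:.3e}")
```

For the first few steps the two orbits are practically indistinguishable, like the first few bounces of the ball. After a few dozen steps they bear no relation to each other.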

So much for particles: waves behave 
completely differently. If a referee blows 
her whistle in the same forest repeatedly 
at slightly different positions, the sound 
waves scattering between the trees change 
much less than the motion of the ball. 
One reason is that waves have an intrin- 
sic length scale, the wavelength, and any 
perturbations affecting the waves over this 
length scale are effectively smoothed out. 

Recent research has shown that acoustic 
noise can be used to synthesize determin- 
istic waves generated by a point source 
— like the whistle-blowing referee. Imag- 
ine it’s raining in the forest. Every raindrop 
excites acoustic waves that bounce among 
the trees in an apparently random fashion. 
Nevertheless, the trees leave an imprint on 
the wave field that is characteristic for the 
forest. The unravelling of this imprint turns 
out to be surprisingly simple. Let's say that, 
instead of a whistle-blowing referee, there 
are two microphones in the forest. With 
a standard computer operation — cross- 
correlation — we can reconstruct the 
sound of the referee's whistle from the 
recorded noise of falling raindrops. 

Take one raindrop that falls in line with 
the two microphones. The sound wave it 
generates travels forward, reaching the 
nearest microphone first and then contin- 
uing to the farther one. The difference in 
the time it takes for the wave to reach each 
microphone equals the time it takes for 
the wave to travel between the two micro- 
phones. The cross-correlation of the sound 
waves recorded by the microphones pro- 
duces a signal at precisely this travel time. 
So it is as if the first microphone acts as a 
source, transmitting a weak sound wave to 
the second. This is enhanced by other rain- 
drops falling in line with the microphones; 
the rest of the raindrops do not produce 
a coherent signal. Taking all the coherent 
waves together, the first microphone acts as 
if it is transmitting the sound of the whistle 
to the second. The reproduced sound can 
be used for imaging — say, to determine 
the position of nearby trees. This principle 
presents the opportunity to do wave exper- 
iments without using active sources. 
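
The in-line raindrop argument can be tried out numerically. The following is a minimal sketch under strong simplifying assumptions, not a model of a real forest: the world is one-dimensional, every drop falls in line with the microphones on the far side of the first one, each drop makes a single click, and the incoherent contributions are represented by independent background hiss at each microphone. The microphone spacing, sound speed, sampling rate and all function names are hypothetical choices.

```python
import random

SPEED_OF_SOUND = 340.0   # m/s, assumed
MIC_SPACING = 34.0       # m between the two microphones, assumed
SAMPLE_RATE = 1000       # samples per second, assumed


def record_rain(n_drops=200, n_samples=4000, noise=0.2, seed=1):
    """Return the two microphone recordings as lists of samples."""
    rng = random.Random(seed)
    delay = round(MIC_SPACING / SPEED_OF_SOUND * SAMPLE_RATE)  # travel time in samples
    # Independent background hiss at each microphone (stands in for incoherent arrivals).
    mic_a = [rng.gauss(0.0, noise) for _ in range(n_samples)]
    mic_b = [rng.gauss(0.0, noise) for _ in range(n_samples)]
    for _ in range(n_drops):
        # Each in-line drop lands at a random time with a random loudness;
        # its click passes microphone A first and reaches B `delay` samples later.
        arrival_a = rng.randrange(0, n_samples - delay)
        loudness = rng.gauss(0.0, 1.0)
        mic_a[arrival_a] += loudness
        mic_b[arrival_a + delay] += loudness
    return mic_a, mic_b


def cross_correlate(a, b, max_lag):
    """Return {lag: sum over t of a[t] * b[t + lag]} for lags in [-max_lag, max_lag]."""
    n = len(a)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        start, stop = max(0, -lag), min(n, n - lag)
        result[lag] = sum(a[t] * b[t + lag] for t in range(start, stop))
    return result


def retrieved_travel_time(max_lag=200):
    """Lag (in seconds) at which the cross-correlation of the rain noise peaks."""
    mic_a, mic_b = record_rain()
    correlation = cross_correlate(mic_a, mic_b, max_lag)
    best_lag = max(correlation, key=correlation.get)
    return best_lag / SAMPLE_RATE


if __name__ == "__main__":
    print("retrieved travel time:", retrieved_travel_time(), "s")
    print("expected travel time: ", MIC_SPACING / SPEED_OF_SOUND, "s")
```

Neither recording contains an identifiable whistle, yet the cross-correlation peaks at the time a sound would need to travel from one microphone to the other, because every drop contributes at that same lag while the products of unrelated clicks and hiss average out.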

It has been known since Albert Ein- 
stein’s seminal 1905 paper on brownian 
motion that the diffusion of a particle is 
related to the way it slows down when it is 
disturbed in some way. This principle was 
later generalized to the fluctuation-dissipa- 
tion theorem, which states that for systems 
in thermal equilib- 
rium, the determin- 
istic response of the 
system is related to 
thermal fluctuations. 
This principle can be 
extended to systems 
so large that ther- 
mal fluctuations are 
irrelevant, such as the 
sound waves gener- 
ated by raindrops fall- 
ing in the forest. It has 
recently been shown 
theoretically that the 
principle holds for a 
wide class of linear 
systems, including electromagnetism, 
flowing media and quantum mechanics. 

Extracting the deterministic response of 
a system from noise is amazing enough, 
but there is more. According to theory, 
noise sources must be distributed homo- 
geneously throughout space, and be 
uncorrelated. That is, the raindrops must 
fall everywhere in the forest, and fall at 
statistically independent times and loca- 
tions. Astonishingly, in many applications 
the extraction of the system response from 
noise is fairly robust when noise sources 
are limited and irregularly distributed, 
probably because of the stability of wave 
propagation. 

Our view of the Universe may have 
shifted from the deterministic to the ran- 
dom, but since the turn of the last century 
physics itself has provided a less simplistic 
view. Fields generated by random sources 
can be used for imaging and for moni- 
toring of systems such as Earth’s subsur- 
face, or of mechanical structures such as 
bridges. Randomness is no longer at odds 
with determinism; it has instead become a 
new window on the deterministic response 
of the physical world. 
Kees Wapenaar is in the Department of 
Geotechnology, Delft University of 
Technology, PO Box 5048, 2600 GA 
Delft, the Netherlands. 

Roel Snieder is at the Center for Wave 
Phenomena, Colorado School of Mines, 
Golden, Colorado 80401-1887, USA. 
Guilt by association 


Anne M. Bowcock 


In a tour-de-force demonstration of feasibility, a consortium of 50 research teams uses 500,000 genetic 
markers from each of 17,000 individuals to identify 24 genetic risk factors for 7 common human diseases. 


Mr Woodhouse, the comical hypochondriac 
of Jane Austen's Emma, takes great comfort in 
blaming his various ailments on the rain, the 
cold and an unfortunate piece of wedding cake. 
He would, no doubt, have been greatly sur- 
prised to learn that even his most rudimentary 
ailments resulted, at least in part, from genetic 
factors. Reporting on page 661 of this issue¹, 
a consortium of more than 50 British groups, 
known collectively as the Wellcome Trust Case 
Control Consortium (WTCCC), asserts just 
that. In the largest study of its type so far, the 
WTCCC has examined the genetic under- 
pinnings of seven common human diseases: 
rheumatoid arthritis, hypertension, Crohn's 
disease (the most common form of inflam- 
matory bowel disease), coronary artery dis- 
ease, bipolar disorder — also known as manic 
depression — and type 1 and type 2 diabetes. 

The WTCCC study is groundbreaking 
in various respects. It not only confirms the 
involvement of some genes for which disease 
association has previously been reported, but 
it also identifies several novel genes that affect 
susceptibility to common diseases. Moreover, 
it models a successful and instructive approach 
to large-scale genomic scans of this type, show- 
ing that a set of common controls can be used 
for a variety of diseases with relatively little loss 
of analytical power. Its success also provides 
strong grounds for performing such studies on 
an even larger scale. 

The WTCCC investigators examined genetic 
variation at 500,000 different positions within 
the genomes of 17,000 individuals living in 
Britain using a genome-wide association scan 
(Fig. 1). This statistical approach compares the 
frequencies of genetic variation in disease cases 
and in healthy controls from the same popula- 
tion. Using the signal from each position as an 
indicator for the DNA sequence that surrounds 
it, genome-wide association scans examine the 
relationship between each DNA position and 
a particular trait (such as diabetes). Strong 
‘association’ between a DNA position and a 
trait marks the general locale of the offending 
alteration, even if it is not itself the cause. 
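
The comparison of frequencies between cases and controls can be made concrete for a single marker. The sketch below applies a plain chi-square test to a 2 × 2 table of allele counts. This is one simple way to test for association and is not claimed to be the consortium's own analysis. The group sizes (2,000 cases and 3,000 shared controls) and the number of markers come from the article; the allele frequencies, the Bonferroni-style threshold and the function name are invented for illustration.

```python
import math


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Chi-square test (1 degree of freedom) on a 2 x 2 table of allele counts.

    Returns (odds_ratio, chi_square, p_value).
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    total = a + b + c + d
    chi_square = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    odds_ratio = (a * d) / (b * c)
    return odds_ratio, chi_square, p_value


if __name__ == "__main__":
    # Each person carries two alleles: 2,000 cases give 4,000 alleles,
    # 3,000 controls give 6,000. The frequencies (34% in cases, 30% in
    # controls) are hypothetical.
    odds_ratio, chi_square, p_value = allelic_association(1360, 2640, 1800, 4200)
    markers = 500_000                 # markers typed per person (from the article)
    threshold = 0.05 / markers        # simple Bonferroni correction (an assumption)
    print(f"odds ratio = {odds_ratio:.2f}")
    print(f"chi-square = {chi_square:.2f}")
    print(f"p-value    = {p_value:.2e}")
    print(f"below {threshold:.0e}: {p_value < threshold}")
```

With these invented counts the odds ratio is about 1.2 and the p-value is of the order of 10⁻⁵, which would look convincing for a single test but falls short of a threshold corrected for half a million markers. This is one way to see why such scans need thousands of participants.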

The concept of drawing an association 
between biological traits and disease is hardly 
new’, but the scope and scale that the WTCCC 
attained in their application of this concept is 
unprecedented. Crucial to both the success 
of this study and keeping its cost reasonable 
were DNA from large numbers of unrelated 
patients; the availability of the complete DNA 
sequence of the human genome; the subsequent 
cataloguing of a large component of variation 
in the genome in the form of single nucleotide 
polymorphisms (SNPs)’; the completion of the 
HapMap project’, which provided information 
on the statistical relatedness of SNPs; and the 
availability of high-throughput technologies 
that allowed for parallel typing of 500,000 
markers representing most of the common 
variation in the genome. 

Figure 1| Genome-wide association scan. To 
identify genetic risk factors for common 
diseases, the WTCCC researchers¹ scanned DNA 
from patients (2,000 per disease) and controls 
(3,000 shared for all seven diseases studied) 
for the frequency with which they contained 
each of the 500,000 genetic markers, or single 
nucleotide polymorphisms (SNPs), from the 
human genome. After statistical evaluation of 
the data, they found that most markers showed 
very little difference in the frequency of their 
two constituent forms — or alleles — between 
controls and cases. However, some SNPs 
occurred at a greater frequency in patients. Such 
alleles (one is shown in red) can be considered a 
genetic risk factor for a particular disease. 

For the seven diseases studied by the 
WTCCC, strong statistical evidence for asso- 
ciation was obtained for 12 previously identi- 
fied genomic regions and a similar number of 
new regions. Although this WTCCC report is 
based on initial studies, independent groups” 
have confirmed the involvement of all but one 
of these most significant regions through rep- 
lication studies. Some of the other identified 
regions with less statistically significant disease 
association are also likely to be true indicators 
of genetic risk; so these will need to be further 
evaluated in additional large sets of patients 
and controls. Indeed, because the WTCCC 
data will be publicly available, they will be a 
useful resource to other groups and consor- 
tia embarking on similar efforts to investigate 
genetic-association markers in these and other 
diseases. These researchers include members of 
the Genetic Association Information Network” 
(GAIN), the Framingham Genetic Research 
Study and the Women’s Health Study. 

With many of the genomic regions identified 
by the WTCCC, the next step will be to study 
the exact nature of the disease-causing variants, 
rather than the marker SNP with which each 
is associated. From this and previous studies, 
it seems that variations leading to common 
disease are diverse; some alter the coding 
sequences of genes, others lie within their non- 
coding sequences, and some are even located 
within gene deserts — regions of a chromo- 
some that contain no genes. So understanding 
the biological function of disease-risk-associ- 
ated genomic regions will be challenging. 

Two replication studies relating to the 
WTCCC findings are also published today*’, 
revealing connections between the genomic 
regions associated with the risk of type 1 dia- 
betes and Crohn's disease and their underlying 
biology. Some of the known and newly identi- 
fied genetic risk factors for type 1 diabetes alter 
the development or function of immune cells, 
leading to aberrant recognition of pancreatic 
islet cells as foreign particles. But additional 
susceptibility genes identified recently” do not 
fit easily into this simple model. 

For Crohn's disease, one of the newly identi- 
fied® susceptibility genes is of particular inter- 
est because it is proposed to control the spread 
of intracellular pathogens by autophagy — the 
process of cellular self-digestion. This is the 
second gene to be implicated in Crohn's dis- 
ease through involvement in autophagy; the 
first was identified earlier this year’. More- 
over, an increasing body of evidence, including 
the latest replication study’, points to defects in 
the early immune response and the handling of 
intracellular gut bacteria in the pathogenesis of 
Crohn's disease. 

The overall increase in risk (1.2-1.5 times) 
conferred by the genetic factors identified in 
the WTCCC study¹ is in agreement with those 
reported by others. However, these factors are 
unlikely to explain completely the clustering of 
any of these diseases in families, and there are 
other genes (possibly many of very small effect) 
— or rare variants of genes — that are still to be 
identified for these and other diseases. 

One unexpected result of the WTCCC study 
was the identification of 13 regions with pro- 
nounced geographical variation within Britain. 
Among these regions is a large cluster of genes 
that encodes the major histocompatibility 
complex, which is well known for its function 
in the immune response and autoimmune dis- 
ease’, and a gene that is involved in lactase per- 
sistence, or the ability to digest milk'*’’. Some 
of the other regions are thought to function 
in preventing diseases such as pellagra, tuber- 
culosis and leprosy. Although the infectious 
agents responsible for tuberculosis and leprosy 
are now rare in Britain, they have left behind 
genetic footprints in the existing population 
that probably led to some degree of protection 
in the past. Several of these are also candidate 
genes for autoimmune disease’. 

Despite the magnitude and wealth of infor- 
mation that this study¹ provides, other ques- 
tions about the genetic basis of common 
disease remain. The answers will become 
increasingly important as we enter an era of 
personalized medicine, in which therapy is 
tailored to an individual's genetic constitution. 
It will become crucial to discover which genes 
predispose individuals to these diseases; how 
genes interact with each other to increase the 
risk of a particular disease; and what propor- 
tion of disease is due to rare variants that would 
be hard to detect with current approaches. 

We will also want to know whether different 
patients can be stratified into subpopulations 
on the basis of genetic risk factors, and what 
role the environment has in triggering disease. 
The Genes, Environment and Health Initiative 
(GEI) of the US National Institutes of Health 
already aims to develop tools to assess environ- 
mental contribution and to answer some of the 
other questions. Ultimately, comprehensive 
answers that would allow the translation of 
genetic susceptibility into scientifically sound 
medical practice will require much larger 
patient populations, well-annotated clinical 
databases and sophisticated environmental 
assessment. One wonders what Mr Woodhouse 
would have to say to that. a 
SPECTROSCOPY 


The magic of solenoids 


Arthur S. Edison and Joanna R. Long 


A technique known as magic-angle spinning has helped make nuclear 
magnetic resonance spectroscopy as sensitive for solids as it is for 
solutions. Inductive thinking leads to even better signal detection. 


The great strength of nuclear magnetic reso- 
nance (NMR) spectroscopy is that it can deter- 
mine, non-invasively and at atomic resolution, 
the chemistry, structure, dynamics and over- 
all architecture of samples in solid, liquid or 
even gaseous forms. The liquid version of the 
technique, solution NMR, is used routinely to 
identify small molecules, study protein struc- 
tures and dynamics, and probe intermolecular 
interactions. Solid-state NMR teases out the 
structure and properties of materials, surfaces 
and biological solids such as human tissue. But 
compared with many other analytical tech- 
niques, NMR has extremely poor sensitivity. 
A great deal of research has sought to improve 
this situation: on page 694 of this issue¹, Sakel- 
lariou et al. describe a potential leap forward 
for solid-state NMR. 

When atomic nuclei with non-zero spin 
are placed in an external magnetic field, they 
become polarized, precessing rather as a gyro- 
scope does in Earth's gravitational field. When 
electromagnetic radiation of a frequency 
(energy) that corresponds exactly to that of 
the energy gap between two states of differ- 
ent polarization is applied to the sample, the 
nuclei resonate, jumping between those states. 
The accompanying gyroscopic precession of 
the spins induces a current in a conducting coil 
placed around the sample. This basic principle 
is both NMR’s blessing and its bane as a spectro- 
scopic technique: the small energies make the 
approach non-destructive, but they also make it 
difficult to distinguish the characteristic polari- 
zation (or signal) from thermal noise. 

Figure 1| Inductive logic. a, In the traditional ‘magic-angle spinning’ approach to solid-state NMR, a 
spectrum of better resolution is achieved by rapidly rotating the sample, at an angle of 54.7° relative to 
the main magnetic field, within a static coil assembly. b, Sakellariou and colleagues’ alternative approach¹ 
uses the inductive coupling of a smaller coil rotating with the sample to the larger static coil to produce a 
similar effect. The result is a higher sensitivity and the capability to investigate smaller samples. 

The signal-to-noise ratio in NMR measure- 
ments can be improved by either one of two 
general routes. The first of these is enhancing 
the starting polarization. NMR resonant ener- 
gies are proportional to the strength of the 
magnetic field. Therefore, stronger magnetic 
fields improve the initial polarization and lead 
to more signal. But generating strong, uniform 
magnetic fields throughout a sample is expen- 
sive and requires considerable infrastructure, 
posing serious practical limitations. Dynamic 
nuclear polarization”? , in which the relatively 
large polarization of electrons compared with 
nuclei is transferred to nuclei, is rapidly gaining 
popularity and applicability, but requires spe- 
cialized equipment and substantial manipula- 
tion of the sample. Furthermore, it might not 
work for all samples and experiments. 
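
A rough calculation shows how small the starting polarization is and why it grows with the field. The sketch below assumes hydrogen nuclei (protons) at room temperature and uses the standard textbook expression for spin-1/2 nuclei in thermal equilibrium; neither the formula nor the example field strengths appear in the article, and the function names are illustrative.

```python
import math

PLANCK = 6.62607015e-34        # J s
BOLTZMANN = 1.380649e-23       # J/K
PROTON_GAMMA_BAR = 42.577e6    # Hz per tesla: proton gyromagnetic ratio / (2 * pi)


def larmor_frequency(field_tesla):
    """Resonance frequency in Hz; it is proportional to the magnetic field."""
    return PROTON_GAMMA_BAR * field_tesla


def thermal_polarization(field_tesla, temperature_kelvin=298.0):
    """Equilibrium polarization of spin-1/2 nuclei: tanh(h * nu / (2 k T))."""
    energy_gap = PLANCK * larmor_frequency(field_tesla)
    return math.tanh(energy_gap / (2.0 * BOLTZMANN * temperature_kelvin))


if __name__ == "__main__":
    for field in (4.7, 11.74, 23.48):   # example field strengths in tesla (assumed)
        print(f"{field:6.2f} T: {larmor_frequency(field) / 1e6:7.1f} MHz, "
              f"polarization {thermal_polarization(field):.2e}")
```

Even at 11.74 tesla only about four protons in every hundred thousand contribute a net signal, and doubling the field merely doubles that fraction, which is why alternative routes to sensitivity are attractive.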

The second general route to more sensitive 
NMR is to design detection schemes that make 
better use of the polarization signal. Several 
research groups are developing procedures that 
rely on mechanical coupling of the polarization 
to very sensitive cantilevers’, or optical rotation 
of a probe beam running through the sample’, 
to improve sensitivity. But these technologies, 
too, have limited practical application. 

Sakellariou and colleagues¹ build on what 
has proved to be one of the most general and 
cost-effective ways to improve the sensitivity of 
solution NMR: detecting the voltages induced 
in a coil that is optimized for and is closer to 
the sample. Such a coil is by its nature more 
efficient, because the signal-to-noise per unit 
mass of sample scales inversely with the diam- 
eter of the coil®. Furthermore, the ‘filling fac- 
tor’ — the volume within the coil that is taken 
up by the sample — is an important variable. 
In solution NMR, solenoidal microcoils have 
been used to analyse liquid sample volumes 
of a few nanolitres’ and to perform magnetic 
resonance microscopy of individual neurons’. 
Systems for analysing volumes of 1-10 micro- 
litres are available commercially, and the ability 
to reduce the sample size has also allowed for 
the collection of many NMR spectra simul- 
taneously”. As well as being highly sensitive, 
solenoidal coils are quite easy to construct on 
a very small scale. 

Until now, however, solid-state NMR had 
not enjoyed as much benefit from microcoils 
as had solution NMR. Unlike molecules in 
solutions, those in solid samples do not tumble 
rapidly or isotropically on the NMR timescale. 
The anisotropic interactions provide impor- 
tant structural information, but they also lead 
to broad, nondescript NMR spectra that are 
intractable to analysis. This problem can be 
countered, and solid-state spectra can achieve a 
resolution similar to that of their solution NMR
counterparts” by a trick known as magic-angle
spinning (MAS), in which the sample is rotated 
at a speed of several kilohertz and at an angle 
of 54.7° relative to the magnetic field. In tradi- 
tional MAS NMR, the sample is spun in a rotor 
within a static assembly containing a fixed coil 
that is some distance from the sample and 
therefore has a poor filling factor. 
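
The 54.7° quoted above is not arbitrary. As standard background (not spelled out in the article), it is the angle at which the second-order Legendre polynomial, which governs the orientation dependence of the main anisotropic interactions, vanishes. A minimal sketch of that calculation, with function and variable names chosen here purely for illustration:

```python
import math

def second_legendre(theta_rad):
    """P2(cos theta) = (3 cos^2(theta) - 1) / 2."""
    return 0.5 * (3.0 * math.cos(theta_rad) ** 2 - 1.0)

# The magic angle solves 3 cos^2(theta) - 1 = 0, i.e. cos(theta) = 1/sqrt(3).
magic_angle_rad = math.acos(1.0 / math.sqrt(3.0))
magic_angle_deg = math.degrees(magic_angle_rad)

print(f"magic angle = {magic_angle_deg:.2f} degrees")
print(f"P2 at the magic angle = {second_legendre(magic_angle_rad):.1e}")
```

Spinning the sample about an axis tilted at this angle averages the P2-dependent terms towards zero, which is why the broad solid-state lines narrow.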

Sakellariou and colleagues’ simple advance’ is 
to wind a solenoid microcoil directly around the
sample — greatly improving the filling factor — 
and to spin the sample and coil together (Fig. 1). 
The spinning microcoil couples inductively to 
a coil that, just as in the conventional approach, 
remains static in the surrounding assembly. 
This ‘magic-angle coil spinning’ (MACS) 
technique uses existing commercial MAS 
solid-state NMR probe technology, while offer- 
ing the advantages of small coil size and excel- 
lent filling factor that have been the province 
of solution NMR for over a decade’. 

The authors’ set-up can improve the signal- 
to-noise ratio by about an order of magnitude, 
and so allows smaller samples to be studied. 
The microcoil can also significantly increase 
the radio-frequency fields for a given current 
within the static coil, allowing more efficient 
manipulation of the spin polarization with 
radio-frequency pulses. The MACS technique 
has many conceivable applications, including 
structural measurements of very small pro- 
tein samples, ‘metabolomics’ studies of the 
biochemistry of microscopic tissue extracts, 
and NMR measurements of radioactive mate- 
rials that must be contained by specialized 
barriers’. 

As with any technology, not all samples will 
be ideal candidates for the approach. This is 
especially true of samples in which the signal 
of interest is present in limited concentrations, 
such as those for trace amounts of metabolites
in tissue, or membrane proteins that aggregate 
at higher concentrations. At low concentra- 
tions, the amount of material will still need 
to be increased. But for many applications in 
chemistry, biology and materials science, Sakel- 
lariou and colleagues’ advance opens up new 
opportunities simply by reducing the amount 
of material required for solid-state NMR stud- 
ies, without needing to invest substantially in 
new technologies.

CLIMATOLOGY


Tempests in time 


James B. Elsner 


The frequency of severe hurricanes in the North Atlantic has increased 
during the past decade. Scrutiny of the prehistoric record left by such 
storms helps to assess the factors contributing to hurricane activity. 


A hurricane is a product of its environment: a 
warm ocean provides sustenance; calm atmos- 
pheric conditions nurture an infant storm; and 
a high-pressure cell in the subtropical atmos- 
phere drives it in a given direction. Increases 
in oceanic heat from global warming will raise 
a hurricane’s potential intensity, all else being 
equal. Yet increases in wind shear — in which 
winds at different altitudes blowing in different 
directions may tear apart the developing storm 
— could counter this tendency by dispersing 
the storm's heat. 

In the long run, which effect will win out? 
Limited instrumental records of hurricanes 
and climate change make it difficult to answer 
this question. So researchers have turned to 
prehistoric ‘proxy’ data to uncover clues about 
what to expect in a warmer world. Two new 
papers, one published on 24 May’ and one 
on page 698 of this issue’, illustrate how the 
approach has been applied to hurricanes in the 
North Atlantic. 


©2007 Nature Publishing Group 


Palaeotempestology is the study of prehis- 
toric storms from geological and biological 
evidence. Coastal wetlands and lakes are sub- 
ject to ‘overwash’ during hurricanes, when 
barrier sand dunes are surmounted by storm 
surge. The assumption is that the waves and 
wind-driven storm surge reach high enough 
over the barrier to deposit a fan of sand in the 
lake*. A sediment core from the bottom of the 
lake shows that fan as a sand layer distinct from 
the fine organic mud that accumulates slowly 
under normal conditions. 

Donnelly and Woodruff¹ analysed sediment
cores they extracted from a lagoon on
the Puerto Rican island of Vieques. The lagoon 
is separated from the ocean by a stable barrier 
of sand. In the core, they found coarse-grained 
sand layers embedded in several metres of 
organic-rich silt. The layers are clearly the 
result of barrier and nearshore sediments that 
have been washed into the lagoon by strong 
hurricanes, the recent layers being correlated
in time with known hurricane strikes. The
authors calibrated the sensitivity of the site to 
storm surge by noting the intensity of known 
strikes that did not leave sand in the core. 

Donnelly and Woodruff find more sand lay- 
ers during the latter half of the Little Ice Age. 
This occurred between 300 and 150 years ago, 
and towards the end of this interval sea tem- 
peratures near Puerto Rico were 2 °C cooler 
than they are now. The authors say this is evi- 
dence that today’s warmth is not needed for 
increased storminess. Not surprisingly, they 
find that intervals in which more hurricanes 
occurred correspond with periods of fewer El
Niño events. El Niño events suppress hurricane
activity in the North Atlantic by increasing the 
amount of wind shear and sinking air. 

Nyberg et al.” describe a different approach 
that has led them to the same conclusion 
— that, in the long run, shear is more impor- 
tant than ocean temperature in modulating 
hurricane activity. They use proxy records 
of shearing winds and ocean temperature to 
reconstruct a two-and-a-half century record of 
major hurricanes and wind shear. The proxies 
are based on luminescence banding in coral 
cores retrieved from sites in the northeastern 
Caribbean, and on a marine sediment core 
from further south. 

However, studies relying on a spatially lim- 
ited set of coring and proxy locations are not 
able to resolve changes in hurricane tracks. The 
northeastern Caribbean is in the direct path of 
hurricanes today, but has it always been? More 
hurricanes occurring locally could mean a shift 
in their direction rather than their abundance. 
Donnelly and Woodruff’ find that changes in 
hurricane frequency over the northeastern 
Caribbean seem to mirror the changes in fre- 
quency inferred from cores collected in New 
York, but the degree of correlation is not quan- 
tified. Proxy data from the Gulf coast show a 
pattern of frequent hurricanes between 3,800 
and 1,000 years ago, followed by relatively few 
hurricanes during the most recent millennium, 
which has been explained in terms of the shift- 
ing position of the subtropical high-pressure 
zone*. Unravelling the causes of changes in 
local hurricane activity requires an understand- 
ing of the factors that influence what track they 
will take’. So further work is needed. 

In addition, the assumption that hurricanes 
are simply passive responders to climate change 
should be challenged. Hurricane activity influ- 
ences the observations and proxies used to 
compute mean quantities such as wind-shear 
and precipitation conditions, so the arguments 
can easily become circular. Reduced rainfall 
and greater mean shear are possible conse- 
quences of fewer hurricanes, not necessarily the 
causes. More importantly, a hurricane removes 
heat and water from the ocean and transports 
them upward and poleward, thereby modify- 
ing the environment that supports it. A strong 
hurricane cools the ocean surface beneath it 
as a result of evaporation and mixing of water 
layers. This makes the area less favourable for
the next storm, but at depth adds heat to the
ocean that can, in the long run, influence the
climate system’.

Figure 1 | One for the modern record. Hurricane Katrina made its infamous assaults on the Bahamas, Cuba, south Florida and the Gulf coast in late August 2005.

Palaeotempestology is a valuable tool for 
answering questions on hurricane climatology. 
But more records are needed before localized 
prehistoric activity can be used to make sense 
of large-scale patterns of storminess. As Liu’ 
has pointed out, each record serves as a ‘palaeoweather
station’, sensitive only to nearby hurricanes.
At present, fewer than a dozen sequences
that have been dated and validated are available 
in hurricane-prone regions of the United States 
and Caribbean. However, the new analyses’ 
and those of others*” are a start. 

When more palaeoweather stations have 
been established, a network can be constructed
with links connecting sites that share similar 
periods of storminess. That network can then 
be compared to a network of storminess from 
modern records (Fig. 1) to better understand 
the evolving mechanisms responsible for 
changing hurricane risk.
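
One way to picture the proposed network is sketched below. The site names, the stormy intervals and the similarity threshold are all hypothetical, and the Jaccard overlap used to decide whether two sites "share similar periods of storminess" is an illustrative choice rather than a method taken from the papers discussed.

```python
from itertools import combinations

# Hypothetical records: for each site, the set of time bins (start year,
# in years before present) that its sediment core flags as stormy.
records = {
    "site_A": {100, 200, 300, 1200},
    "site_B": {200, 300, 1200, 2500},
    "site_C": {2500, 2600, 3000},
}

def jaccard(a, b):
    """Shared stormy bins divided by all stormy bins seen at either site."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_network(site_records, threshold=0.5):
    """Link every pair of sites whose overlap reaches the (assumed) threshold."""
    links = []
    for s1, s2 in combinations(sorted(site_records), 2):
        score = jaccard(site_records[s1], site_records[s2])
        if score >= threshold:
            links.append((s1, s2, score))
    return links

links = build_network(records)
for s1, s2, score in links:
    print(f"{s1} -- {s2}: overlap {score:.2f}")
```

With these made-up records only site_A and site_B are linked. A real analysis would also have to account for dating uncertainty and for each site's different sensitivity to storm surge.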
STEM CELLS


Recycling the abnormal 


Alan Colman and Justine Burley 


Using human eggs in the quest to make donor-specific embryonic stem 
cells is controversial. A method developed in mice, if applicable to humans, 
could eliminate the need to obtain eggs for this purpose. 


On page 679 of this issue, Egli et al.’ describe a 
promising method for generating embryonic 
stem-cell (ESC) lineages using the technique 
of somatic-cell nuclear transfer (SCNT). Con- 
ventional SCNT involves replacement of the 
nuclear genetic material of an unfertilized egg 
(oocyte), with that of a somatic (non-germ) 
cell. After ‘fertilization’, which is induced by
chemical or electrical triggers, the embryo 
undergoes several rounds of cell division and, 
after implantation into a foster mother, may 
develop to term. So far, this technique has been 
used successfully to clone 12 species. It also has 
been used in mice to generate ESCs from a 3.5- 
day-old mouse embryo’ — a blastocyst. 




Since Dolly the Sheep was cloned by SCNT 
more than ten years ago’, it has been hoped 
that this technique would serve to create 
patient-matched ESCs for therapy, and human- 
disease-specific ESC lines for use in basic 
research and drug development. However, in 
contrast to SCNT in mice, the use of this tech- 
nique in humans has been thwarted by techni- 
cal difficulties, as well as logistical and ethical 
concerns about obtaining oocytes. Now, Egli 
and colleagues’ describe a different approach 
to produce donor/disease-specific ESC lines 
that may well revolutionize the field of human 
stem-cell research, and that removes one of 
the main ethical objections to such work. The
crux of their contribution is the use of fertilized
eggs, instead of oocytes, as SCNT recipients.

Figure 1 | Somatic-cell nuclear transfer using abnormal embryos. Egli et al.' generated abnormal mouse zygotes, in which the egg was fertilized with two
sperm cells (n indicate pronuclei). Such zygotes are often useless by-products of human in vitro fertilization procedures. However, using inhibitors, the authors
allowed the abnormal mouse zygote to progress through the cell cycle up to the point during mitosis at which its chromosomes aligned on the mitotic spindle.
They then mechanically removed the spindle and replaced it with the condensed chromosomal content of a donor embryonic stem cell (ESC), also arrested at
mitosis. Removal of the inhibitors allowed development to resume, and a blastocyst formed. This is a promising technical feat, as the authors also found that
blastocysts formed in this way but using ESCs or adult tail-tip cells as donors and normal zygotes as recipients led to live offspring or, alternatively, new ESCs.

Historically, fertilized mouse eggs at the 
one-cell stage — the zygote — have been suc- 
cessfully used as recipients of nuclear genetic 
material’, but only when the donor cells were 
also zygotes and not from later developmen- 
tal stages*. Possible reasons for this limitation 
include loss of essential non-DNA factors with 
the removed genetic material®, and inadequate 
time for the reprogramming of the donor’s 
genetic material in its new environment”. 

Egli et al. reasoned that the loss of the cru- 
cial factors could be minimized or eliminated 
if nuclear transfer is conducted when both the 
recipient zygote and the donor cell are tem- 
porarily arrested at the mitotic cell division. 
To test this, they used the drug nocodazole 
to arrest mitosis in mouse zygotes at the stage 
when chromosomes condense. Replacing noco- 
dazole with another inhibitor allowed chrom- 
osome alignment along the mitotic spindle, 
but prevented further cell-cycle progression. 
The spindle could then be seen using optical 
devices, and removed mechanically. 

Donor zygotes and two- and eight-celled 
embryos were also arrested with nocodazole. 
The condensed chromosomes were then 
identified, removed from individual cells, and 
injected into the cytoplasm of treated recipi- 
ent zygotes. Removal of the inhibitors allowed 
development to resume, and the resulting 
blastocysts were returned to foster mothers. 
Donors from all three stages of development 
led to some live births. 

Next, the authors used mouse ESCs as 
donors. The resultant blastocysts were either 
returned to foster mothers — leading to nine 
live births — or were used to make new ESC 
lines. By injecting these SCNT-derived ESCs 
into normal host blastocysts they showed that
these cells had the full range of developmental
potencies expected from bona fide mouse
ESCs. 

Finally, adult tail-tip cells were used as donors
to make donor-specific SCNT-derived
ESCs. Previously, mitotic, embryonic’ and
somatic cells* have all been used as donors in
nuclear transfer experiments. But Egli et al. are
the first to use a mitotic cytoplasm as a recipi- 
ent. Using cells at this stage of the cell cycle 
as recipients may expedite reprogramming of 
donor chromosomes, because at other stages 
reprogramming factors are probably seques- 
tered within cells’ nuclei’. 

In terms of efficiency, the method reported 
by Egli et al.’ is not better than previous ones. 
So why all the excitement? After all, the new 
method seems to be ethically inferior, as in 
generating SCNT-derived ESCs, two, rather 
than one, developing embryos are disrupted 
— the original zygote and the SCNT-derived 
embryo. The answer can be found in the results 
of their last experiment. 

The researchers generated an embryo con- 
taining three sets of chromosomes — in which 
two sperm cells fertilized a single oocyte. Such 
embryos never develop normally. Nevertheless,
replacement of these three sets of chromosomes
with one set from an ESC led to a
normally developing embryo, which could 
potentially be used to generate a new ESC line 
(Fig. 1). This finding could have a profound 
effect on developing a viable and tractable 
method of SCNT in humans. 

The failure of SCNT in humans and monkeys 
has been attributed by some” to fundamental 
differences between primate and non-primate 
unfertilized eggs in the way their spindles form 
during cell division. However, even if this dif- 
ficulty could be surmounted, obtaining freshly 
ovulated human oocytes would remain of
logistical and ethical concern; unfortunately,
in contrast to recent success in mice’’, aged, 
unfertilized oocytes — a by-product of normal 
in vitro fertilization (IVF) procedures — have 
been inadequate for SCNT in humans’”. How- 
ever, if the technique developed by Egli and col- 
leagues could be used successfully in humans, 
all of these problems would be circumvented. 
It is estimated that 3-5% of fertilized human 
zygotes contain supernumerary sets of chro- 
mosomes”’. Such zygotes are always excluded 
from clinical use in IVF centres because they 
cannot develop, and are therefore disposed 
of. The possibility of recycling non-viable 
zygotes to produce ESC lines obviates the need 
for oocyte donation. So those who have been 
troubled by this ethical aspect of human SCNT 
stem-cell research will be very encouraged by 
the results of Egli and his colleagues’.

ASTROPHYSICS


Gravitational waves constrained 


Michele Maggiore 


Cosmic gravitational waves could provide unprecedented information on 
the early Universe. The effects that are of interest are small, but experiments 
are gradually achieving a sensitivity that will test cosmological models. 


Gravitational waves are tiny disturbances in 
space-time. They can be triggered during 
cataclysmic events involving stars or black 
holes, and they could even have been gener- 
ated in the very early Universe, well before 
any star formed, merely as a consequence of 
the dynamics and expansion of the Universe. 
In the latter case, these waves should provide 
a ‘background’ signal of gravitational waves 
coming from all directions in space — if indeed 
they can be spotted. One particularly sensitive 
experiment recruited to the search for gravita- 
tional waves is LIGO, the Laser Interferometer 
Gravitational-Wave Observatory. It has just
published the results from its fourth bout (S4) 
of data-taking in The Astrophysical Journal’. 
Gravitational waves are not the only known 
source of cosmic ‘noise’. Most famously, the
Universe is filled with a background of elec- 
tromagnetic radiation left behind by the hot 
Big Bang; it has now cooled to its present tem- 
perature of about 2.7 kelvin by the subsequent 
expansion of the Universe. The discovery of 
this ‘cosmic microwave background’ by Arno 
Penzias and Robert Wilson in 1964 is a mile- 
stone in the history of modern cosmology, and 
its detailed study provides some of our best
information on the early Universe. In 1992, 
NASA’s Cosmic Background Explorer (COBE)
satellite reported its measurement of the spec- 
trum of the microwave background and found 
it to have a perfect ‘black-body’ form with a 
characteristic temperature that has tiny vari- 
ations across the sky — the ‘seeds’ for galaxy 
formation’. Subsequent experiments, in particular
NASA’s follow-up WMAP (Wilkinson
Microwave Anisotropy Probe) mission, have 
provided a more detailed picture, and ushered 
in an epoch of precision cosmology, in which 
the agreement between experimental data and 
theoretical models can be at the level of a few 
per cent. 

The discovery of a cosmological background 
of gravitational radiation would arguably be 
even more fundamental. Any background of 
relic particles provides us with a snapshot of 
the Universe at a very definite time: the time at 
which these particles decoupled from the pri- 
mordial plasma. For the photons of the cosmic 
microwave background, this happened when 
the Universe was just 270,000 years old. The 
photons we see today in the cosmic micro- 
wave background are a true photograph of the 
Universe at that age. 


Figure 1 | Long arms for gravitational waves. The LIGO site at Hanford, Washington.

50 YEARS AGO

An investigation during 1954-56,
into hygiene in restaurants
and public houses, was then
described... In the first survey
covering fifty representative
kitchens only twenty-seven
out of 260 washed utensils
examined attained the United
States Public Health Standard.
Only two from forty-two drying
cloths showed less than 500
organisms per square inch.
Only seven from forty-two wash
and rinse waters yielded less
than 500 organisms per ml.
74 per cent of kitchens yielded
faecal Bacterium coli from one or
more items but no recognized
types of food-poisoning
pathogens were isolated apart
from Staphylococcus aureus...
Arrangements in many kitchens
were poor, the paramount need
being for improved ventilation
and more hot water at 180° F.
From Nature 8 June 1957


100 YEARS AGO 

The Origin of Radium by 

E. Rutherford — In a previous 
letter to Nature (January 17)
I gave an account of some
experiments which I had made
upon the growth of radium
in preparations of actinium.

...I think we may [now] safely
conclude that, in the ordinary
commercial preparations of
actinium, there exists a new
substance which is slowly 
transformed into radium. This 
intermediate parent of radium 
is chemically quite distinct from 
actinium and radium and their 
known products, and is capable 
of separation from them. 

It is not possible at present to 
decide definitely whether this 
parent substance is a final product 
of the transformation of actinium 
or not. It is not improbable that 
it may prove to be the long- 
looked-for intermediate product 
of slow transformation between 
uranium X and radium, but with 
no direct radio-active connection 
with actinium. If this be the case, 
the position of actinium in the 
radio-active series still remains 
unsettled. 

From Nature 7 June 1907

The more weakly a particle interacts, the
earlier it detaches itself from the primordial 
plasma. Weakly interacting neutrinos, for 
instance, decoupled when the Universe was 
only about a second old. Because the gravi- 
tational force is so very small in the realm of 
elementary particles, the interaction of gravi- 
tational waves with the primordial plasma is 
negligible — they have been propagating freely 
ever since they were generated. In particular, 
gravitational waves produced during the Big 
Bang would carry a genuine picture of the Big 
Bang itself, providing information that no 
other messenger can carry. 

LIGO, together with its European counter- 
part VIRGO near Pisa, Italy, is the most ambi- 
tious project to date to search for gravitational 
waves. It consists of three detectors, two on a
site in Washington (Fig. 1) and one, 3,000 kilometres
away, on a site in Louisiana. The passage
of a gravitational wave would cause a tiny delay
in the passage of laser beams reflected up and 
down LIGO’s 4-kilometre-long detector arms. 
Although LIGO has not made a positive detec- 
tion of gravitational waves, the upper bound 
on the intensity of a random background is an 
interesting result in its own right. 

The strength of the gravitational-wave
background is quantified by its energy density,
$\rho_{\mathrm{gw}}$. In cosmology, there is a natural unit for
energy density, the critical density for ‘closing’
the Universe, $\rho_{\mathrm{c}}$. If the Universe’s density
is greater than $\rho_{\mathrm{c}}$, the force of its own gravity
will at some point cause it to begin contracting,
ending in a reverse of the Big Bang, the
Big Crunch; if, however, it is smaller than this
critical density, the Universe’s expansion will
continue unchecked for ever. It is thus convenient
to use the ‘normalized’ energy density,
$\Omega_{\mathrm{gw}} = \rho_{\mathrm{gw}}/\rho_{\mathrm{c}}$. LIGO’s latest upper bound’ for
the stochastic gravitational-wave background
is $\Omega_{\mathrm{gw}} < 6.5 \times 10^{-5}$.
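
To give the normalized bound a physical scale, the sketch below converts it into an energy density expressed as an equivalent mass density. It relies on the standard expression for the critical density, 3H0²/(8πG), which the article does not quote, and on an assumed Hubble constant of 70 km/s/Mpc; both the formula and that value are background assumptions made here, and the function names are illustrative only.

```python
import math

G = 6.674e-11              # gravitational constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22       # one megaparsec in metres
H0_KM_S_MPC = 70.0         # ASSUMED Hubble constant, km/s/Mpc
OMEGA_GW_LIMIT = 6.5e-5    # upper bound on the normalized energy density

def critical_density(h0_km_s_mpc):
    """Critical density in kg/m^3: rho_c = 3 H0^2 / (8 pi G)."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC_IN_M   # convert H0 to s^-1
    return 3.0 * h0_si ** 2 / (8.0 * math.pi * G)

rho_c = critical_density(H0_KM_S_MPC)
rho_gw_limit = OMEGA_GW_LIMIT * rho_c        # mass-equivalent upper bound

print(f"critical density       ~ {rho_c:.2e} kg/m^3")
print(f"gravitational-wave cap ~ {rho_gw_limit:.2e} kg/m^3")
```

Because the critical density scales as the square of H0, the physical bound shifts with the adopted Hubble constant, whereas the normalized quantity is easier to compare between measurements.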

This experimental limit is interesting 
because it represents a sensitivity at which cur- 
rent models of cosmology tell us the detection 
of gravitational waves is not excluded. Upper 
bounds on $\Omega_{\mathrm{gw}}$ can be deduced from various
astrophysical and cosmological observations’. 
In the frequency range accessible to LIGO, the 
most important limit comes from the produc- 
tion of light elements other than hydrogen in 
the first few minutes of the Universe, known 
as Big-Bang nucleosynthesis. The abundance 
of these light elements is fixed by the balance 
between the rate of the nuclear reactions that 
produce them and the expansion rate of the 
Universe, which dilutes them. 

The latter rate is determined, through the 
equations of general relativity, by the total 
energy density of the Universe. This consists 
of the energy density carried by the known 
elementary particles, plus the energy density 
carried by any other more exotic form of matter 
— or indeed by gravitational waves. When only 
the known elementary particles are included in 
the computation, theory and observation agree 
beautifully. The energy density of any other
extra particle, and of gravitational waves, at the
epoch of nucleosynthesis, is then constrained
in order not to spoil this agreement. This puts
an upper bound on $\Omega_{\mathrm{gw}}$ at the level of a few
times $10^{-5}$, which is comparable to LIGO’s new
upper bound’.

Certain cosmological models predict values
of $\Omega_{\mathrm{gw}}$ that could be almost as large as is
allowed by the nucleosynthesis bound. In par- 
ticular, a pre-Big-Bang model that includes the 
low-energy effects of string theory* predicts a 
stochastic background of gravitational waves 
that, for some values of its input parameters, 
approaches this bound’. Such cosmological 
models are thus now seeing experimental 
constraints. 

The data from LIGO’s S4 run that have now 
been published’ were taken over a period of 
one month, between February and March 
2005. The duration of the data-taking is a 
major factor because of the way the stochas- 
tic background is extracted by correlating the 
data of two detectors. This procedure allows 
the gravitational-wave signal to be extracted 
from the much greater effect of noise local 
to the detectors — laser fluctuations, seismic 
rumblings and so on. This signal-to-noise
ratio scales as the square-root of the total 
observation time. 
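
As a rough illustration of this scaling, the sketch below compares a one-month run with a one-year run, taking a year as 12 months and holding detector noise fixed. It isolates the effect of observation time only; any gain from lower detector noise would come on top of it.

```python
import math

def snr_gain(t_new, t_old):
    """Signal-to-noise gain from a longer run, assuming SNR scales as sqrt(T)."""
    return math.sqrt(t_new / t_old)

S4_MONTHS = 1.0    # roughly one month of data, as stated in the text
S5_MONTHS = 12.0   # one year of coincident data, taken as 12 months

gain = snr_gain(S5_MONTHS, S4_MONTHS)
print(f"SNR gain from observation time alone: x{gain:.2f}")
```

A twelvefold longer run therefore buys only about a factor of 3.5, so most of a tenfold or hundredfold improvement has to come from better detector sensitivity.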

LIGO is now engaged in its fifth period of 
data-taking (S5), which will collect one year 
of coincident data between its detectors, with 
an improved sensitivity over the S4 data’.
The combination of better sensitivity and the 
longer run is expected to improve the sensitivity
to $\Omega_{\mathrm{gw}}$ by a further factor of 10-100.
In a few years, an upgrade of the experiment, 
known as Advanced LIGO, should eventually 
reach sensitivities between $10^{-8}$ and $10^{-9}$. That
should allow us to penetrate deep into a totally 
unknown region, where the answers to fundamental
questions could well be waiting.
Michele Maggiore is in the Department of 
Theoretical Physics, University of Geneva, 

1211 Geneva 4, Switzerland. 
e-mail: michele.maggiore@physics.unige.ch

DISEASE ECOLOGY


The silence of the robins 


Carsten Rahbek 


A continent-wide analysis suggests that West Nile virus has severely 
affected bird populations associated with human habitats in North America. 
The declines parallel patterns of human disease caused by the virus. 


Scenes reminiscent of those described in 
Rachel Carson's Silent Spring' have been
occurring in suburban America. This time, it 
is not pesticides that are to blame for a decline 
in bird populations, but outbreaks of West Nile 
virus’. A study by LaDeau and colleagues’ on 
page 710 of this issue shows that reductions 
in bird populations correlate with the preva- 
lence of the virus, that these patterns are 
upheld across years and throughout the con- 
tinent, and that the patterns are geographically 
correlated with epidemics of human infection 
by West Nile virus*. 

West Nile virus emerged in New York City 
from the Old World in 1999, and then spread 
rapidly across the entire continent. The pri- 
mary hosts of the virus are birds, in which virus 
numbers are also amplified before the virus is 
transmitted by mosquitoes to the next victim. 
Besides birds, the virus can infect other verte- 
brates, including humans, and has caused the 
death of as many as 1,000 people’ in the United 
States alone, as well as uncounted casualties in 
birds and other vertebrates” (Box 1). 


*This article and the paper concerned’ were published online 
on 16 May 2007.

LaDeau and colleagues have dealt with
several analytical challenges to demonstrate 
that West Nile virus is indeed the main factor 
behind the observed large-scale declines in bird 
populations. Continent-wide fluctuations of 
this kind have been documented previously”®, 
but they have been explained by changes in the 
local environment related to habitat, land use 
and climate. LaDeau and colleagues had to dis- 
entangle virus-induced mortality from these 
confounding effects. 

To do so, they designed species-specific 
predictive models based on knowledge of the 
prevalence of the virus, exposure to mosqui- 
toes and overall mortality for 20 different bird 
species, each species representing a specific 
combination of urban (human) association 
and susceptibility to the virus. The model was 
applied to 26 years of population data for six 
geographical regions to construct probability 
distributions for the expected abundance of 
each bird species in a given region before and 
after the arrival of the virus.

Advance Online Publication | doi:10.1038/nature05889 | Published online 16 May 2007

Figure 1 | Viral victim: the American robin. Populations of this
bird and of the American crow are among the seven species most
clearly identified by LaDeau et al.’ as suffering from mortality
caused by West Nile virus. The other five species for which there
is a robust correlation between population declines and virus
infection are the blue jay (Cyanocitta cristata), tufted titmouse
(Baeolophus bicolor), house wren (Troglodytes aedon), chickadee
(Poecile spp.) and Eastern bluebird (Sialia sialis).

The results are revealing: significant popu-
lation changes in seven of the 20 species
were in agreement with specific expectations
based on the direct adverse impact of virus 
infections. Although this may seem a mod- 
est effect, LaDeau and colleagues’ analyses 
deliberately included tolerant bird species that 
were unlikely to be greatly affected for various 
ecological reasons. For the species thought to 
be susceptible to West Nile virus, there was a 
disturbingly consistent general relationship 
between the predicted effects of the
virus and the observed declines in 
population abundance. The cor- 
relation was far from perfect. But it 
suggests that West Nile virus could 
potentially change the composition 
of bird communities across the entire 
continent. 

Strikingly, the seven bird species 
that are most clearly affected by the 
virus are all ‘peridomestic’, that is,
they are associated with human popu- 
lations, in this case those in town and 
city suburbs. Among the disappearing 
species is an icon of North American 
garden birds, the American robin 
(Turdus migratorius; Fig. 1). It is also 
thought-provoking that no fewer than 
13 of the 20 species experienced a 10- 
year population low following the 
human epidemics of West Nile virus 
in 2002-03 in the United States.
This is a notable observation in light 
of the debate about the spread of the 
highly pathogenic avian influenza 
virus (H5N1 strain), and the potential
role of migratory, peridomestic
and domestic birds as reservoirs and 
dispersers of this disease. 

LaDeau et al. caution against oversimplified
interpretations of their
results. The spatial patterns of disease 
that they detected may still reflect 
regional differences in the intensity of 
viral transmission, and these may be 
linked to spatial patterns in habitat, 
land use and climate — all of which 
are traditionally used to explain large-scale 
patterns of changes in bird populations. 

The authors partly incorporated the 
potential influence of the El Niño/Southern
Oscillation in their models as a crude meas- 
urement of climate variability, but their analysis 
does not include environmental or climatic 
parameters at the appropriate spatial scale. 
This may explain why, with the exception of 
the American crow (Corvus brachyrhynchos), 
the results are qualitatively rather inconsist- 
ent for individual species. But the results for 
the crow are compelling, not least given the 
Box 1 | West Nile virus in the Old and New Worlds


West Nile virus was first
isolated in 1937 in the West
Nile province of Uganda,
an area known for other
mosquito-transmitted
diseases such as malaria,
schistosomiasis, dengue and
yellow fever. It has apparently
only recently caused serious
illness among humans, and
largely outside the tropics.


The first documented 
human epidemic occurred in 
Israel in 1951-54, with another 
in 1957. Major outbreaks 
occurred in South Africa 
(1974) and Israel (2000), 
with minor incidences in 
France (1962), Algeria (1994), 
Romania (1996) and Russia 
(1999), but only one in central 
Africa (1998). 


Following its detection in the
New World in 1999, West Nile
virus spread quickly during
the dry summer of 2002,
and now occurs throughout
North and Central America,
and in the Caribbean. It has
killed individuals of almost
200 bird species in North
America, as well as other wild
and domestic animals. By
contrast, it seems that infected
Old World birds rarely show
adverse symptoms.

The Old World human 
epidemics have generally been 
local and short-lived. But, as is 
the case with birds, the human 
effects of the virus in the New 
World are much more severe. 
Between 1999 and 2006,
23,974 cases were reported in
the United States: 14,125 were
of the mild West Nile fever,
and 9,849 of the severe West
Nile meningitis or encephalitis
(inflammation of the spinal
cord and brain), including 962
deaths. There is currently no
vaccine for humans.

The different effects of 
West Nile virus in the Old 
and New Worlds could be 
an example of host-parasite 
coevolution, in which the virus 
has coevolved with birds and 
humans in tropical Africa in 
particular, and so has a less 
lethal effect on its hosts. If
so, evolutionary adaptation 
might occur among New 
World species, which will 
minimize the virus’s future 
impact. C.R.


geographical correlation with human infection 
shown in Figure 2 of the paper.

More detailed analyses and studies on fur- 
ther species will be needed to fully understand 
the impact of West Nile virus on large-scale 
changes in North American bird populations. 
But even as it stands, this research reminds 
us once more of the threat of infectious dis- 
eases to both biodiversity and human health. 
The migratory passenger pigeon (Ectopistes 
migratorius) of North America, once the most
abundant bird of its time with an estimated
population of between 3 billion and 5 billion,
was driven to extinction within a century by
human agency and, possibly, diseases. The
disappearance of such an abundant species must
have had a considerable effect on the
communities in which it occurred. Indeed, it has been
suggested that the rise in incidence of Lyme
disease in humans is a delayed consequence of
the removal of the passenger pigeon from the
ecosystems of North America.
We are witnessing the emergence of novel
diseases at an unprecedented rate. Epstein and
colleagues have argued that human-induced
changes in ecological systems and climate are
now triggering “a barrage of emerging diseases
that afflict humans, livestock, wildlife,
marine organisms, and the very habitat we
depend upon”. LaDeau and colleagues’ study is
a timely example of the effect that such diseases
can have on communities of wild species and
humans alike, even at a continental scale.
Carsten Rahbek is at the Center for 
Macroecology, Institute of Biology, University of 
Copenhagen, Universitetsparken 15, DK-2100 
Copenhagen, Denmark. 
e-mail: crahbek@bi.ku.dk
Theodore H. Maiman (1927-2007)


Maker of the first laser. 


The physicist Theodore (Ted) Maiman died 
on 5 May in Vancouver, British Columbia, at 
the age of 79. As creator of the first operating 
laser, he has left an enduring mark on science 
and technology. 

Maiman was born in Los Angeles, 
California, and showed an early aptitude 
for electrical engineering which took him 
first to the University of Colorado and 
then to Stanford University, where he was 
awarded a PhD in 1955. Subsequently, as a 
young scientist at Hughes Aircraft Company 
in Malibu, California, he worked on the 
amplification of microwaves by masers, and 
was eager to produce similar amplification 
at light wavelengths. His superiors at Hughes 
were wary of such work, wanting Maiman to 
do “something useful”. But on his insistence, 
they let him proceed. 

His breakthrough involved the use of a 
ruby crystal, which others interested in lasers 
thought would probably not work. However, 
Maiman introduced a technique that had 
not been considered, the excitation of ruby 
with an intense flash lamp. And it worked! 

A powerful red light beam was produced, 
lasting only the short time of the exciting 
flash, but nevertheless providing remarkably 
high intensity — many orders of magnitude 
more intense than any previous light source. 
And it formed a coherent, highly directed 
beam. Because the laser pulses produced 
lasted only a short time, others were eager 

to produce continuously operating lasers, 
which they soon did. But the very short 
pulses that lasers produce are themselves now 
exciting tools used in science and for wireless 
communication systems. 

Maiman initially sent a description of his 
device to Physical Review Letters. But it was 
rejected because so many manuscripts on 
masers had been submitted to the journal 
that its editors made the unusual decision 
to accept no more papers in the field. So 
Maiman sent it to Nature, where his now 
famous paper, ‘Stimulated optical radiation
in ruby’, appeared on 6 August 1960 (T. H.
Maiman Nature 187, 493-494; 1960). It was 
very brief, and I have previously commented 
that this article was probably more important 
per word than any of the papers published by 
Nature over the past century. The device was 
quickly replicated by many other scientists, 
still other types were invented, and soon 
the word laser — for ‘light amplification 
by stimulated emission of radiation’ — was 
common currency. 

Few applications for lasers were initially 
envisaged by most scientists; it was 
sometimes referred to as “a solution looking
for a problem”. But the ensuing development
of the principle produced many forms of laser 
— ranging in size from the minuscule to the 
enormous — and they have now permeated 
almost all fields of science and technology. 
Lasers are widespread in industry and
are tools for much new science: their
use underlies the award of several Nobel
prizes. They are now exploited in cutting
and welding; in communications; for
high-precision measurements and
convenient directional control; in
nanotechnology; in innovative forms
of microscopy and for manipulating
microorganisms; in computing; and in
medicine.

Precise measurement of the distance 
between Earth and the Moon has been 
provided by lasers, and some scientists 
are looking for possible laser signals from 
planets around distant stars, guessing that 
intelligent extraterrestrial beings would 
use them to signal to us. Maiman himself 
was particularly pleased with the medical 
applications of lasers, such as reattachment 
of detached retinas. He did not like, 
and played down, their use as weapons. 
This was a popular idea for a while after 
the tremendous potential power of the 
technology was recognized. 

Infrared lasers (wavelength 30-1,000 µm)
have found application in detecting
explosives and chemical-warfare agents.
If the word ‘erasers’ were not already in use,
perhaps infrared amplification by stimulated
emission would be produced by ‘irasers’.

But the term laser actually refers to such 
systems with wavelengths up to 1 mm (above 
which the name maser takes over), and also 
to much shorter wavelengths: X-ray and 
γ-ray lasers are now with us, and their likely
further development may prompt yet further 
scientific progress. 

Stimulated emission of radiation, the 
critical process behind the laser, was first 
recognized by Albert Einstein as early as 
1918. But it was only in 1951 that its use for 
practical amplification of electromagnetic 
waves was recognized, and in 1954 the first 
such device, the maser (for ‘microwave 
amplification by stimulated emission’),
operating at centimetre wavelengths, was 
constructed. Art Schawlow and I pointed 
out in 1958 how the same process could be 
made to work for light waves, which then set 
off a flurry of work in many places to actually
build such a device. 

All the earliest types of laser were 
invented in industry by recently hired young 
physicists who came from university work 
in radio or microwave spectroscopy. This
appropriately brought together engineering 
and spectroscopy. Maiman’s background was 
just such a case: at Stanford he had been a 
student of Willis Lamb, a Nobel laureate for 
his research on the spectrum of hydrogen. 
The next type of laser, similar to Maiman’s 
but using a different type of crystal, was 
made by Peter Sorokin and Mirek Stevenson 
at IBM in Yorktown Heights, New York. 

The next was a continuously operating laser 
produced by an electrical discharge, created 
by Ali Javan, William Bennett and Don 
Herriott at the Bell Telephone Labs in 
Murray Hill, New Jersey. 

After his invention, Maiman did early 
research in nonlinear optics, a field made 
possible by intense laser beams. He also 
formed several companies devoted to laser 
development and applications, including, 
in 1962, the Korad Corporation. His own
account of his discovery appeared in
his book The Laser Odyssey, published
in 2000.

Theodore Maiman’s contribution, the first 
operating laser on Earth (they have now been 
found to occur naturally in astronomical 
objects) was truly historic, and has been 
widely recognized. He was chosen to be a
member of the National Inventors Hall of 
Fame and of the US National Academies, and 
received many awards, including the Wolf 
Prize in Physics, the Oliver Buckley Prize and 
the Japan Prize. 

Charles H. Townes 

Charles H. Townes is in the Department of 
Replicating genotype-phenotype associations


What constitutes replication of a genotype-phenotype association, and how best can it be achieved? 


NCI-NHGRI Working Group on Replication in Association Studies

The study of human genetics has recently 
undergone a dramatic transition with the com- 
pletion of both the sequencing of the human 
genome and the mapping of human haplo- 
types of the most common form of genetic 
variation, the single nucleotide polymorphism 
(SNP). In concert with this rapid expansion
of detailed genomic information, cost-effective 
genotyping technologies have been developed 
that can assay hundreds of thousands of SNPs 
simultaneously. Together, these advances have 
allowed a systematic, even ‘agnostic’, approach
to genome-wide interrogation, thereby relaxing 
the requirement for strong prior hypotheses. 
So far, comprehensive reviews of the pub- 
lished literature, most of which reports work 
based on the candidate-gene approach, have 
demonstrated a plethora of questionable geno- 
type-phenotype associations, replication of 
which has often failed in independent studies.
As the transition to genome-wide association
studies occurs, the challenge will be to
separate true associations from the blizzard of 
false positives attained through attempts to rep- 
licate positive findings in subsequent studies. 
The purpose of a replication study is to evaluate
a positive finding from a previous study,
to provide credibility that the initial finding is 
valid. Replication is essential for establishing 
the credibility of a genotype-phenotype asso- 
ciation, whether derived from candidate-gene 
or genome-wide association studies. However, 
there is a lack of agreement about what consti- 
tutes a finding deserving of replication, what 
constitutes an adequate replication study and 
what constitutes a replication or refutation. 
Investigators and journal editors have offered 
guidelines for how to address this problem,
but these initial efforts have been hampered by 
limited experience and conflicting empirical 
data. However, as evidence has accumulated, 
several instructive examples have emerged 
of genotype-phenotype associations being 
reproduced reliably in follow-up studies. These 
include peroxisome proliferator-activated 
receptor-γ (PPARG) and the transcription
factor TCF7L2 (refs 14-19), related to diabetes;
nucleotide-binding oligomerization domain
containing 2 (NOD2) and Crohn's disease;
complement factor H (CFH) and age-related
macular degeneration; and chromosome
region 8q24 and prostate cancer risk.
Many instances have arisen in which initial 
findings have not been reproduced in follow-up 


studies because of issues in either the initial 
study or the attempted replication. Small
sample size is a frequent problem and can result 
in insufficient power to detect minor contri- 
butions of one or more alleles. Similarly, small 
sample sizes can provide imprecise or incor- 
rect estimates of the magnitude of the observed 
effects. Poor study design — particularly a lack 
of comparability between cases and controls 
— can increase the risk of biases because there 
can be heterogeneity in exposure to environ- 
mental challenges and population stratifica- 
tion. The latter arises when investigators fail 
to account for case-control differences in the 
genetic structure of the underlying population. 
Heterogeneity in classification of outcomes 
across studies can undermine the opportu- 
nity to compare among them. Similarly, data 
‘dredging’ can be a major problem, especially 
when criteria for defining phenotypes are 
altered to achieve statistical significance worthy 
of publication. 

Another challenge arises when follow-up 
studies analyse different variants. An example 
is the reported association between DTNBP1
and schizophrenia, initially identified in Irish
pedigrees and ‘confirmed’ in independent
European studies. Unfortunately, different
risk alleles and haplotypes were reported in
each study, making comparison difficult.
Although it is plausible that more than one
variant could contribute to schizophrenia risk
at the DTNBP1 locus, it is difficult to draw this
conclusion from the literature because follow-up
studies have not consistently analysed the
same markers or those in perfect linkage
disequilibrium (r² = 1.0). Other recent examples
for which initial reports of association have
been inconsistently replicated include
insulin-induced gene 2 (INSIG2) and obesity,
and cyclic-AMP-specific phosphodiesterase
(PDE4D) and stroke. These have been
accompanied by controversies about what 
actually constitutes replication. 
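
To make the criterion of perfect linkage disequilibrium concrete, the short Python sketch below computes r² for two biallelic markers from haplotype and allele frequencies, using the standard definition r² = D² / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA pB. The function name and the example frequencies are illustrative assumptions and are not taken from any of the studies discussed here.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: population frequencies of alleles A and B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium, D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical case 1: allele A always travels with allele B (perfect proxy).
perfect = ld_r_squared(p_ab=0.3, p_a=0.3, p_b=0.3)

# Hypothetical case 2: the two markers are statistically independent.
independent = ld_r_squared(p_ab=0.3 * 0.5, p_a=0.3, p_b=0.5)
```

Only when r² is close to 1 can one marker stand in for another, which is why follow-up studies that genotype weakly correlated markers are hard to read as either replication or refutation.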

This paper presents the conclusions of a 
working group on the replication of geno- 
type-phenotype associations — whether 
identified in genome-wide or candidate-gene 
studies — convened by the National Cancer 
Institute and the National Human Genome 
Research Institute. The group was composed 
of experts from diverse disciplines, including 
biostatistics, clinical medicine, epidemiology, 
genetics and scientific publishing. The purpose 
was to review the current state of the field and 
propose best practices for the design, conduct 
and publication of replication studies that aim 
to follow up notable findings, particularly in 
genome-wide association studies. The group 
addressed three topics. First, assessment of the 
validity and limitations of any single genetic 
association study. Second, criteria for establish- 
ing replication in genetic association studies. 
Third, points to consider for publication of 
high-quality genotype-phenotype association 
reports (Box 1). 
Box 1 | Points to consider in genotype-phenotype association reports


This checklist is intended to serve as a guide for 
authors, journal editors and referees to allow 
clear and unambiguous interpretation of the 
data and results of genome-wide and other 
genotype-phenotype association studies. 


Study information 

* A detailed description of the study design and 
its implementation 

* The source of cases and controls (or cohort 
members, if based on cohort design), 
including time period and location(s) of 
subject recruitment 

* Methods for ascertaining and validating 
affected or unaffected status and 
reproducibility of classification 

* Participation rates for cases, controls or 
cohort members 

* Presentation of case and control selection 
in a flow chart, including exclusion points 
for missing and erroneous data (possibly as 
supplementary tables) 

* Initial table comparing relevant 
characteristics (such as demographics, risk 
factors and exposures) of cases and controls 

* Success rate for DNA acquisition, including 
comparisons of those with and without 
collection, extraction failures and exclusions 
due to inconsistent data 


Data issues 

* Statement on availability of results and data 
so that, as far as possible, others can analyse 
them independently 

* Links to supplemental online resources and 
database accession numbers 


Genotyping and quality control procedures 

* Sample tracking methods, such as bar- 
coding, to ensure accuracy of analysis 

* Description of genotyping assays and 
protocols, particularly when new or applied in 
 a non-standard method

* Description of genotyping calling algorithm 

* Genotype quality control design for samples, 
including numbers, plating locations, 
selection criteria for: 
* External control samples from standard 
accepted sets (such as HapMap) 
* Internal control samples (duplicate 
samples; it should be specified whether 
these are from the same or different DNA 
collection, extraction or aliquot) 


Initial association studies 

The initial study of any association represents 
an important discovery tool. In the near future, 
it is unlikely that a single study will unequivo- 
cally establish a valid genotype-phenotype 
association and not require replication. A 
number of points relating to the study design 
and reporting should be considered in deter- 
mining whether a finding in an initial genome- 
wide or candidate-gene study merits follow-up 
replication studies (Box 2). Attempts to repli- 
cate a reported association are often compli- 
cated by lack of methodological detail in the 
* Assay and DNA quality metrics by locus,
sample, plate or ‘batch’ 

* Assay call rates 

* Average error rates estimated by internal 
duplicates or external samples 

* Assay reproducibility: concordance for 
performance of extraction, aliquoting 
(internal control samples) and assay 
reproducibility 

* Concordance with published or previously 
generated genotypes 

* Mendelian consistency checks if related 
individuals are present 

* Detection of inconsistent or cryptic 
relatedness in study subjects 

* Evaluation of deviations from Hardy- 
Weinberg proportions to detect failed assays 
or large-scale stratification (for example, 
testing Hardy-Weinberg equilibrium 
 ‘violations’) separately in cases and controls

* Assessment of population heterogeneity, 
including 
* Average or median value of chi-square and 
full distribution 
* Q-Q plots of chi-square analysis and P- 
values (with specific description of type of 
test used to generate the values) 

* Validation of most critical results on an 
independent genotyping platform 


Results 

* Analysis methods in sufficient detail to 
reconstruct the analytical approach and 
reproduce all reported results 

* Description of any pre-analysis weighting 
scheme for selecting variants for replication 

* Simple single-locus and multi-marker 
(haplotype) association analyses 

* Genetic models tested (unconstrained 
 genotype effects, dominant, additive,
multiplicative or trend) 

* Graphical display of genotype clustering for 
assays of high interest 

* Verification of results at highly correlated loci 

* Discussion of choice of threshold for 
significance and the statistical basis for 
any adjustment for multiple testing and the 
relationship to overall study power 

* Significance of any known ‘positive controls’ 
(that is, loci established in previous genetic 
associations) 

* Consistency of results before and after 
application of quality control filters 


initial report or lack of methodological rigour 
in the original study. 

Because of the enormous number of geno- 
type-phenotype associations tested in each 
genome-wide study, spurious associations will 
substantially outnumber true ones unless rigor- 
ous statistical thresholds are applied. Although 
no universal threshold can be specified for 
statistical significance in all circumstances, 
smaller P-values generally provide greater sup- 
port for a true association. Extremely small P- 
values should be interpreted carefully, however, 
until completion of replication studies, because 
Replication studies

* Description of replication samples, including 
source, ascertainment and comparability to 
initial sample 

* Discussion of choice of threshold for 
significance and the statistical basis for 
any adjustment for multiple testing and the 
relationship to overall study power 

* Summary of replication and analysis attempts 
by authors 

* Summary of all known replication attempts 
by others, including non-replications 


Genotyping data and specifications for 

deposition in standard databases 

* Availability of ‘raw’ genotype data in the 
technology and vendor format, consistent 
with the requirements or restrictions 
imposed by funding agencies or informed 
consent 

* Data extraction and processing 
protocols 

* Normalization, transformation and data 
selection procedures and parameters 


Points for reviewers and authors to consider 

regarding priority for publication 

* Strength of observation 

* Suitably large sample size 

* Sufficiently stringent criteria for significance 
(small P-values) 

* High quality of study design, including 
selection of study population, reliability of 
phenotypes, measurement and adjustment 
for potential confounders 

* Discussion and conclusions commensurate 
with sample size, power, P-value and 
epidemiological quality of study design 

* Quality control standards used, including 
assessment of genotype quality and 
completeness 

* Usefulness of observations to others for 
subsequent research 

* Value of initial hypothesis described 

* Brief presentation of implications, especially 
as they relate to further follow-up both of 
genetic markers and for corroborative studies 
to investigate plausibility 

* Explanations of notable findings 

* Appropriate alternative explanations 
proposed and briefly discussed 

* Biological or functional explanations based 
firmly on available data 


many can be due to inappropriate reliance on 
asymptotic distributions of test statistics, or 
to technical artefact or genotype errors that 
are distributed differently between cases and 
controls. Cluster plots for highly significant 
markers should be examined carefully. It may 
be desirable to include confirmatory data from 
a second genotyping technology in the initial 
report to verify genotype accuracy. Cases and 
controls should be drawn from populations 
that are generally comparable both in terms 
of genetic background and environmental
exposures, and should be analysed for
confounding population stratification. This
may require genotyping of ancestry informa- 
tive markers (AIMs), which should be strongly 
encouraged as genotype costs fall and AIMs 
become increasingly well-characterized within 
marker sets. Family-based studies are not affected
by population stratification when analysed with
methods that are robust to it, such as
transmission disequilibrium methods. They
may be particularly valuable in the initial study 
if there is evidence for ethnic differences in the 
genetic effect of a trait, although at the cost of 
increased genotyping. Cautious interpretation 
is required either if significance is observed 
only for unusual or highly specific phenotypes 
(especially if they represent a small proportion 
of the study sample) or if significance depends 
on a particular analytical method that is not 
publicly available for confirmation. 

Approaches for dealing with multiple com- 
parisons are beyond the scope of this report, 
but more robust methods are clearly needed.
Permutation testing is an effective strategy to 
address the problem of multiple comparisons, 
especially if a large number of phenotypes are 
being analysed. Many methods for addressing 
the problem of multiple comparisons invoke 
a conservative approach, namely a standard 
Bonferroni correction, which assumes the
independence of all tests performed. In many
association studies, markers are not independent
because they are in linkage disequilibrium, and 
so a standard Bonferroni correction is overly 
conservative. Lowering the threshold for call- 
ing a finding of particular variants — such as 
non-synonymous coding SNPs — positive in 
the analysis scheme (weighting) has merit but 
must be declared before initiation of the analysis
and not once the analysis has begun. The
number of variants for which there is either 
credible laboratory evidence or a validated in 
silico prediction a priori is quite small. However, 
the temptation to create a credible biological 
hypothesis post hoc can be quite strong. 
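
As a rough illustration of why the standard Bonferroni correction becomes so demanding in genome-wide work, the Python sketch below computes the per-test threshold for a given number of tests and flags which P-values survive it. The SNP counts, the ‘effective’ number of independent tests and the example P-values are all assumed figures chosen for illustration; they are not recommendations from the working group.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_values, alpha=0.05):
    """Return the indices of the P-values that remain significant after correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < threshold]


# Hypothetical genome-wide scan of 500,000 SNPs treated as independent tests.
genome_wide = bonferroni_threshold(0.05, 500_000)

# If linkage disequilibrium left only about 250,000 effectively independent
# tests (an assumed figure), the matching threshold would be less strict.
ld_adjusted = bonferroni_threshold(0.05, 250_000)

# Four hypothetical P-values: the corrected threshold is 0.05 / 4 = 0.0125.
hits = passes_bonferroni([0.001, 0.04, 0.2, 0.012])
```

The gap between the two thresholds is the sense in which the uncorrected count of markers overstates the multiple-testing burden when markers are correlated.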

At present, many studies are barely powered 
to identify, much less to establish, associations 
of common alleles of weak effect in complex 
diseases. Recently, appreciation of this
crucial issue has led to larger, more definitive 
studies, such as the Cancer Genetic Markers 
of Susceptibility (CGEMS) project and the 
Wellcome Trust Case Control Consortium
(WTCCC). An estimated large effect (that is, 
with an odds ratio greater than 2) in a well- 
powered study can lend credence to an associa- 
tion, because unknown confounding factors 
are less likely to produce large effects.
Unfortunately, many risk variants contribute less than
this. Small studies are prone to large variation 
in risk estimates, of which only selected strong 
positives are initially detected and reported. 
Furthermore, the estimate of the effect declines 
as replication studies are pursued, a phenom- 
enon known as ‘winner's curse’. 
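
The winner's curse can be reproduced with a small simulation. The Python sketch below is a toy model under stated assumptions: every study estimates the same true log odds ratio with normally distributed error, and only estimates that reach nominal significance are ‘reported’. The true odds ratio of 1.2, the standard error of 0.15 and the function name are hypothetical choices, not values from the text.

```python
import math
import random
import statistics


def simulate_winners_curse(true_or=1.2, se=0.15, n_studies=10_000, seed=1):
    """Compare the average odds ratio across all studies with the average
    across only those studies that reached nominal significance."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_beta = math.log(true_or)
    # Each small study yields a noisy estimate of the true log odds ratio.
    estimates = [rng.gauss(true_beta, se) for _ in range(n_studies)]
    # A study is "reported" if its z-score exceeds 1.96 (two-sided 5% level,
    # positive direction only).
    reported = [b for b in estimates if b / se > 1.96]
    all_or = math.exp(statistics.mean(estimates))
    reported_or = math.exp(statistics.mean(reported))
    return all_or, reported_or, len(reported)


all_or, reported_or, n_reported = simulate_winners_curse()
```

Because a reported estimate must exceed 1.96 standard errors, every reported odds ratio in this toy model lies above exp(1.96 × 0.15), about 1.34, even though the true value is 1.2; unselected replication studies would then be expected to drift back towards the true effect.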

Consortial studies comprised of multiple 
independent studies combined into a pooled 
analysis can be viewed as a practical approach 




Box 2 | Suggested criteria for establishing the soundness of an initial association report 


These criteria are intended for studies of 
genotype-phenotype associations assessed by 
genome-wide or candidate-gene approaches. 


* Statistical analyses demonstrating the level
 of statistical significance of a finding should
 be published or at least available so that
 others can attempt to reproduce the reported
 results

* Explicit information should be provided
 about the study's power to detect a range of
 effects

* The study should be epidemiologically
 sound, with careful accounting for
 potential biases in selection of subjects,
 characterization of phenotypes,
 comparability of environmental exposures
 (when possible) and underlying population
 structure in cases and controls

* Phenotypes should be assessed according to
 standard definitions provided in the report

* Associations should be consistent (within
 the range of expected statistical fluctuation)
 and reported for the same phenotypes
 across study subgroups or across similar
 phenotypes in the entire study group

* Significance should not depend on altering
 the quality control methods beyond standard
 approaches that could change inclusion or
 exclusion of large numbers of samples or loci


that overcomes many of the disadvantages of 
a disconnected set of underpowered studies. 
In addition, consortia may meet the need for 
rapid replication by achieving sufficiently large 
sample size. Collaborations among multiple
independent studies can offer important
advantages over a single large study, particu- 
larly regarding the generalizability of findings 
observed in multiple studies that typically have 
greater diversity of populations and/or expo- 
sures. 

As far as possible, similarly rigorous cri- 
teria should be considered for evaluation 
of genotype-phenotype association studies 
with limited or no availability of subjects for 
replication, such as studies of rare diseases or 
severe toxicity due to therapy or environmen- 
tal exposures. In these circumstances, addi- 
tional information gathered from laboratory 
techniques, bioinformatic tools and a priori 
biological insight should be used to provide 
plausibility for interpreting genetic association 
findings. The expectation for demonstrated 
replication might be relaxed if it is unethical to 
attempt replication — such as in studies that 
link genetic variation with adverse effects of 
therapy or environmental exposure (for exam- 
ple, benzene or cigarette smoke). Similarly, the 
public health impact of a finding may lessen 
the stringency of expectation for replication 
before initial publication — for example, in an 
urgent situation in which effective intervention 
is available and can be readily implemented. 

Genotype-phenotype associations that have 
been replicated widely have often used clearly 

* Measures to assess the quality of genotype
data should include results of known study
sample duplicates or publicly available
samples
* The results for concordance between
duplicate samples (if applicable) as well as
completion and call rates per SNP and per
subject should be disclosed, along with rates
of missing data
* A subset of notable SNPs should be evaluated
with a second technology that verifies the
same result with excellent concordance,
because no technology is error-free
* Associations with nearby SNPs in strong
linkage disequilibrium with the putatively
associated SNP should be reported (and
should be similar)
* The results of replication studies of previous
findings should be reported even if the results
are not significant
* Testing for differences in underlying
population structure in case and control
groups should be performed and reported
* Appropriate correction for multiple
comparisons across all statistical tests
examined should be reported. Comparison
to genome-wide thresholds should
be described. Similarly, for Bayesian
approaches, the choice of prior probabilities
should be described


defined phenotypes classified by standard and 
widely accepted criteria, such as diabetes and
age-related macular degeneration. Use of
accepted criteria should reduce misclassification
rates. Some association studies have reported
intermediate phenotypes (known as endophenotypes)
but have provided little detail on the
actual measure or its reliability. In the absence
of standard criteria, sufficient detail should 
be provided for both the definition of the pheno- 
types investigated and assessment of their 
validity and comparability across studies. 


Replication of initial studies 

To establish a positive replication of a geno- 
type-phenotype association, many of the same 
considerations important for genome-wide 
association or candidate-gene studies should 
be fulfilled (Box 3). In replication studies, 
every effort should be made to analyse phe- 
notypes comparable to those reported in the 
initial study. In the first attempt to replicate 
a finding, comparable populations should 
be analysed not only for the main effect but 
also to guard against confounding population 
stratification, either in the initial or replication
studies. Because many initial studies
and replication studies have been reported in 
populations of European descent, the challenge 
remains to extend the studies to other popu- 
lations. It has already been shown that many 
variants that have a significant association with 
disease in several studies in one population 
may not necessarily have the same association 
in another (such as TCF7L2 in West Africa and 

Box 3 | Suggested criteria for establishing positive replication


These criteria are intended for follow-up 
studies of initial reports of genotype- 
phenotype associations assessed by genome- 
wide or candidate-gene approaches. 


* Replication studies should be of sufficient 
sample size to convincingly distinguish the 
proposed effect from no effect 

* Replication studies should preferably be 
conducted in independent data sets, to avoid 
the tendency to split one well-powered study 
into two less conclusive ones 

* The same or a very similar phenotype should 
be analysed 

* A similar population should be studied,
and notable differences between the 
populations studied in the initial and 
attempted replication studies should be 
described 


East Asia; in this case, it has provided an
opportunity to refine the signal to a restricted 
region). In some circumstances, it might be 
impossible to conduct follow-up studies 
because of the uniqueness of a study popu- 
lation or the lack of availability of additional 
subjects for replication. If replication is not an 
option, interpretation of association findings 
could be supplemented by biological insights 
derived from the laboratory. 

Evaluation of an association in populations 
of different ancestry from that of the initial 
report would generally be expected, because 
genomic variation is greater when compared 
across populations, and should increase con- 
fidence in the finding. By contrast, failure to 
replicate in a population different from that of 
the initial report does not necessarily invalidate 
the original finding. In some cases, the differ- 
ences in linkage disequilibrium relationships 
across populations can be used to narrow the 
region of interest for later genetic and possible 
functional analysis. Owing to their robustness 
to population stratification, as noted above, 
family-based studies can also serve as valuable 
replication studies for notable findings.

Reports of attempts at replication should 
distinguish between tests of the same SNP as 
in the original study, SNPs in strong linkage 
disequilibrium with the reported SNP, and 
other SNPs that were genotyped to search for 
additional variants associated with disease in 
the region (Fig. 1). In some circumstances, the 
initial study might have identified a marker 
that is not in strong linkage disequilibrium with 
the causal variant, which could lead to a false 
refutation in a different population, whereas 
testing additional SNPs in the region might 
reveal another association worthy of follow-up.
For clarity, if new, previously untested SNPs
are included, they should be clearly identified 
and the rationale for their inclusion explicitly 
stated. If differences in linkage disequilibrium 
patterns across populations are used to invoke 
an association at a new marker but not at the 
originally tested marker, the different linkage 

* Similar magnitude of effect and significance
should be demonstrated, in the same
direction, with the same SNP or a SNP in
perfect or very high linkage disequilibrium
with the prior SNP (r² close to 1.0)

* Statistical significance should first be 
obtained using the genetic model reported in 
the initial study 

* When possible, a joint or combined analysis 
should lead to a smaller P-value than that 
seen in the initial report

* A strong rationale should be provided for
selecting SNPs to be replicated from the 
initial study, including linkage-disequilibrium 
structure, putative functional data or 
published literature 

* Replication reports should include the same 
level of detail for study design and analysis 
plan as reported for the initial study (Box 1) 


disequilibrium patterns should be empirically 
demonstrated in the appropriate populations 
and shown to be a plausible and consistent 
explanation for both the new and original 
results. Otherwise, the new association can- 
not be considered a replication. 
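
To illustrate the r² criterion in Box 3, and what an empirical demonstration of linkage disequilibrium might involve, the following Python sketch computes r² between two biallelic SNPs from phased haplotype counts. The function name and the four-count input format are assumptions made for this example. In practice haplotype frequencies usually have to be estimated from unphased genotypes, which this sketch does not attempt.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Squared correlation (r^2) between two biallelic SNPs.

    Inputs are counts of the four phased haplotypes: A/a are the alleles
    at the first SNP and B/b the alleles at the second (hypothetical labels).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype is required")

    p_AB = n_AB / total              # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total      # allele frequency of A
    p_B = (n_AB + n_aB) / total      # allele frequency of B

    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic")

    d = p_AB - p_A * p_B             # linkage disequilibrium coefficient D
    return d * d / denominator


# Illustrative counts only: two SNPs that almost always travel together.
print(round(ld_r_squared(480, 10, 10, 500), 3))
```

Running the same function on haplotype counts from the initial and the replication populations gives a direct, if simplified, check of whether a proxy SNP is an adequate substitute for the originally reported one in both.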


Publication of associations 

The evaluation of a publication addressing one 
or more genotype-phenotype associations
is a daunting task in the age of large, dense 
datasets. To this end, published genome-wide 
association reports should include detailed 
descriptions of design, genotyping and statisti- 
cal methods, and results, even if available only 
through online supplements, or perhaps in a
separate journal. A checklist of key possible 
issues is provided in Box 1 — this could be 
used as a guide for authors, editors, reviewers 
and the general readership. 

It is a challenge to make the case for the 
importance of the replication finding(s) with- 
out exaggerating the significance of the obser- 
vation. Remarks about possible follow-up of 
genetic markers and corroborative studies to 
investigate plausibility should be brief and well 
referenced. Authors should practise sound 
judgement and temper enthusiasm based on
prior publications (especially from the same 
investigative group), particularly if the replica- 
tion study results differ from those of the initial 
study. Disclosure of known previous attempts 
to replicate the reported findings, whether 
positive or negative, by the authors or others 
is important for interpreting the replication 
study. 

Although it is desirable for the initial report 
of a genotype-phenotype association to 
include adequately powered replication stud- 
ies, requiring replication with every initial 
study may not be necessary, as long as the pre- 
liminary nature of a study without replication 
is emphasized. Such studies can still provide 
valuable information if the entire set of results 
is made available, and releasing such results 
before replication would be of value to the field. 
However, there is substantial added value in 
presenting robust findings based on an initial 
scan together with follow-up replication, and 
an appropriate balance is needed that facilitates 
rapid publication of valid findings and encourages
collaboration. If replication studies are
included, each should be described or refer- 
enced in the same detail as the initial study and 
should include the results for all SNPs tested at 
each stage. As noted above, replication studies 
should preferably investigate the same or a very 
similar phenotype. 
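
One common way to carry out the joint or combined analysis mentioned in Box 3 is a fixed-effect, inverse-variance-weighted combination of per-study log odds ratios. The Python sketch below is an illustration under that assumption, not the method of any particular study. The function names and the example effect sizes are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def combine_fixed_effect(estimates):
    """Pool (log_odds_ratio, standard_error) pairs by inverse-variance weighting.

    Returns the pooled log odds ratio, its standard error and the
    two-sided P value of the pooled estimate.
    """
    weights = [1.0 / (se * se) for _, se in estimates]
    weight_sum = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    return pooled, pooled_se, two_sided_p(pooled / pooled_se)


# Hypothetical initial and replication estimates (log odds ratio, standard error).
initial = (math.log(1.30), 0.08)
replication = (math.log(1.25), 0.07)

p_initial = two_sided_p(initial[0] / initial[1])
pooled, pooled_se, p_combined = combine_fixed_effect([initial, replication])
print(f"initial P = {p_initial:.2e}, combined P = {p_combined:.2e}")
```

When the replication estimate points in the same direction with a similar magnitude, the combined P value falls below the initial one. An estimate in the opposite direction pulls the pooled effect towards zero instead.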

In many cases, the follow-up study will fail 
to replicate the initial results. Such findings 
are valuable for distinguishing false-positives 
from the true-positive signals that should be 
pursued for putative causal variants. The pref- 
erence for publishing positive findings, even 
if derived from suboptimal studies, presents 
a formidable barrier to the dissemination of 
well-conducted negative studies. Failure to dis- 
seminate results from well-conducted negative 
studies withholds essential pieces of evidence 
for investigators who may be deciding whether 
to launch a follow-up study to replicate or to 
extend the original study. Thus, high-quality 
instances of ‘meaningful negativity’ are use- 
ful and should be reported succinctly in the 
literature. Criteria for a meaningful negative 


Figure 1 | Linkage disequilibrium across the region containing SNPs associated with breast cancer
in FGFR2. Black diamonds represent four single nucleotide polymorphisms (SNPs; rs11200014,
rs2981579, rs1219648 and rs2420946) for which associations with breast cancer were replicated in
multiple studies. Estimates of the square of the correlation coefficient (r²) were calculated for each
pairwise comparison of SNPs in the initial genome-wide association study across the FGFR2 region.
The log(10) r² values are colour-coded.
replication study are the same as those for a
positive study (Box 3), with the added requirements
that the same trait should be studied in a
population of comparable underlying structure 
with sufficient power to measure the appropri- 
ate effect size and yield a negative result. 

Negative studies are difficult to publish 
but they are crucial for separating true-posi- 
tive from false-positive findings. Journals are 
strongly encouraged to publish high-qual- 
ity negative studies refuting earlier positive 
reports of genotype-phenotype associations. 
The journal in which the initial scan is pub- 
lished is encouraged to solicit and publish 
well-conducted follow-up studies within a 
specified time frame, perhaps between 3 and 
9 months of the initial report. A case in point 
is the recent collection of reports published by 
The American Journal of Human Genetics
that failed to replicate the initial findings of a 
genome-wide association study on Parkinson's 
disease. A handful of journals (such as Cancer
Epidemiology, Biomarkers and Prevention
and the new PLoS series) currently feature
well-conducted negative reports, and such 
efforts are to be lauded. The value of a well- 
executed negative study cannot be overempha- 
sized; more venues are needed to capture these 
valuable results. 

Although there are challenges to making data 
on individual research participants available to 
other investigators, every effort should be made 
to provide researchers with an opportunity to 
reproduce the reported results and to investi- 
gate new hypotheses and methods. To facilitate 
this research in genome-wide association stud- 
ies, a public data archive known as the Data- 
base of Genotypes and Phenotypes, or dbGaP 
(http://view.ncbi.nlm.nih.gov/dbgap) has been 
established at the National Library of Medicine's 
National Center for Biotechnology Information 
and will be used by many National Institutes 
of Health (NIH)-supported studies. dbGaP 
will provide study documentation and aggre- 
gated genotype and phenotype data through 
its website with no account or authorization 
required. Access to individual, de-identified 
genotype and phenotype data will require an 
authorization and approval process that is cur- 
rently under development. Whether through 
dbGaP or other venues, genotype summaries of 
computed analyses should be published online 
unless there are strong reasons not to do so, such 
as data derived from special populations (that 
is, isolated populations or minority communi- 
ties) or other groups that will not permit such 
sharing. There are substantial informatic chal- 
lenges for data presentation and data archiving, 
especially on public and journal websites. Best 
practices for retrieval and analysis of such data 
continue to evolve. 


Conclusion 

The history of genotype-phenotype associa- 
tion studies has focused on initial discoveries 
as opposed to careful replication. Earlier attention
to the appropriate design of subsequent
replication studies might have helped limit the
plethora of false-positive results. Determina- 
tion of valid genotype-phenotype associations 
presents a series of challenges that will require 
a logical strategy for conducting well-designed 
studies, based on excellent quality control 
practices interwoven with sound analytical 
methods and judicious interpretation. Other 
than the obvious differences in the drawbacks 
involved in multiple comparisons, standards 
for assessing the validity of the initial findings 
of a genotype-phenotype association should 
not differ substantially between the candidate- 
gene approach and genome-wide association 
studies. As experience accumulates, we can 
look forward to methodological advances 
that will facilitate our interpretation of stud- 
ies, such as continued improvement of pro- 
posed methods for lowering the threshold for 
positive findings, adjustments for population 
structure, and exploitation of linkage disequi- 
librium structure in a candidate region. 

The best practices suggested here for report- 
ing initial and replication studies are based on 
sufficient disclosure of study methods to permit 
independent confirmation of study findings. 
Often a sequence of studies will be required 
to establish a valid genotype-phenotype asso- 
ciation, perhaps involving several rounds of 
replication studies. And, of course, the conclu- 
sive demonstration of a replicated association 
represents only the beginning of the process 
towards finding the causal genetic variant(s). 
Labour-intensive and costly investigation will 
subsequently be required to sequence the can- 
didate interval in depth, genotype all the com- 
mon and perhaps uncommon variants that are 
markers for the outcomes of interest in multiple 
population samples, understand their func- 
tional consequences, examine their potential 
interactions with other genes or environmental 
factors, and devise strategies for preventative or 
therapeutic interventions. None of these steps 
should proceed far, however, without conclusive
replication of findings from an initial genotype-phenotype
association study.


Note added in proof: Recently, a series of papers
has also shown replication across as well as
within genome-wide association studies in
common complex diseases such as breast
cancer, type 2 diabetes, and coronary disease.


Genome-wide association study of 14,000
cases of seven common diseases and 
3,000 shared controls 


The Wellcome Trust Case Control Consortium* 


There is increasing evidence that genome-wide association (GWA) studies represent a powerful approach to the 
identification of genes involved in common human diseases. We describe a joint GWA study (using the Affymetrix GeneChip 
500K Mapping Array Set) undertaken in the British population, which has examined ~2,000 individuals for each of 7 major 
diseases and a shared set of ~3,000 controls. Case-control comparisons identified 24 independent association signals at 
P < 5 × 10⁻⁷: 1 in bipolar disorder, 1 in coronary artery disease, 9 in Crohn's disease, 3 in rheumatoid arthritis, 7 in type 1
diabetes and 3 in type 2 diabetes. On the basis of prior findings and replication studies thus far completed, almost all of these
signals reflect genuine susceptibility effects. We observed association at many previously identified loci, and found 
compelling evidence that some loci confer risk for more than one of the diseases studied. Across all diseases, we identified a 
large number of further signals (including 58 loci with single-point P values between 10⁻⁵ and 5 × 10⁻⁷) likely to yield
additional susceptibility loci. The importance of appropriately large samples was confirmed by the modest effect sizes 
observed at most loci identified. This study thus represents a thorough validation of the GWA approach. It has also 
demonstrated that careful use of a shared control group represents a safe and effective approach to GWA analyses of 
multiple disease phenotypes; has generated a genome-wide genotype database for future studies of common diseases in the 
British population; and shown that, provided individuals with non-European ancestry are excluded, the extent of population 
stratification in the British population is generally modest. Our findings offer new avenues for exploring the pathophysiology
of these important disorders. We anticipate that our data, results and software, which will be widely available to other
investigators, will provide a powerful resource for human genetics research. 


Despite extensive research efforts for more than a decade, the genetic 
basis of common human diseases remains largely unknown. Although 
there have been some notable successes, linkage and candidate gene
association studies have often failed to deliver definitive results. Yet 
the identification of the variants, genes and pathways involved in 
particular diseases offers a potential route to new therapies, improved 
diagnosis and better disease prevention. For some time it has been 
hoped that the advent of genome-wide association (GWA) studies 
would provide a successful new tool for unlocking the genetic basis 
of many of these common causes of human morbidity and mortality.

Three recent advances mean that GWA studies that are powered to 
detect plausible effect sizes are now possible. First, the International
HapMap resource, which documents patterns of genome-wide variation
and linkage disequilibrium in four population samples, greatly
facilitates both the design and analysis of association studies. Second, 
the availability of dense genotyping chips, containing sets of hundreds of 
thousands of single nucleotide polymorphisms (SNPs) that provide 
good coverage of much of the human genome, means that for the first 
time GWA studies for thousands of cases and controls are technically and 
financially feasible. Third, appropriately large and well-characterized 
clinical samples have been assembled for many common diseases. 

The Wellcome Trust Case Control Consortium (WTCCC) was 
formed with a view to exploring the utility, design and analyses of 
GWA studies. It brought together over 50 research groups from the 
UK that are active in researching the genetics of common human 
diseases, with expertise ranging from clinical, through genotyping, to 


informatics and statistical analysis. Here we describe the main experi- 
ment of the consortium: GWA studies of 2,000 cases and 3,000 shared 
controls for 7 complex human diseases of major public health import- 
ance—bipolar disorder (BD), coronary artery disease (CAD), Crohn’s 
disease (CD), hypertension (HT), rheumatoid arthritis (RA), type 1 
diabetes (T1D), and type 2 diabetes (T2D). Two further experiments 
undertaken by the consortium will be reported elsewhere: a GWA 
study for tuberculosis in 1,500 cases and 1,500 controls, sampled from 
The Gambia; and an association study of 1,500 common controls with 
1,000 cases for each of breast cancer, multiple sclerosis, ankylosing 
spondylitis and autoimmune thyroid disease, all typed at around 
15,000 mainly non-synonymous SNPs. By simultaneously studying 
seven diseases with differing aetiologies, we hoped to develop insights, 
not only into the specific genetic contributions to each of the diseases, 
but also into differences in allelic architecture across the diseases. A 
further major aim was to address important methodological issues of 
relevance to all GWA studies, such as quality control, design and ana- 
lysis. In addition to our main association results, we address several of 
these issues below, including the choice of controls for genetic studies, 
the extent of population structure within Great Britain, sample sizes 
necessary to detect genetic effects of varying sizes, and improvements in 
genotype-calling algorithms and analytical methods. 


Samples and experimental analyses 


Individuals included in the study were living within England,
Scotland and Wales (‘Great Britain’) and the vast majority had
self-identified as white Europeans (153 individuals with
non-Caucasian ancestry were excluded from final analysis; see
below). The seven conditions selected for study are all common
familial diseases of major public health importance both in the UK
and globally, and for which suitable nationally representative sample
sets were available. The control individuals came from two sources:
1,500 individuals from the 1958 British Birth Cohort (58C) and 1,500
individuals selected from blood donors recruited as part of this project
(UK Blood Services (UKBS) controls). See Methods and
Supplementary Table 1 for sample recruitment, phenotypes and
summary details for each collection.

*Lists of participants and affiliations appear at the end of the paper.

We adopted an experimental design with 2,000 cases for each 
disease and 3,000 combined controls. All 17,000 samples were geno- 
typed with the GeneChip 500K Mapping Array Set (Affymetrix chip), 
which comprises 500,568 SNPs, as described in Methods. The power 
of this study (estimated from simulations that mimic linkage dis- 
equilibrium patterns in the HapMap Caucasian sample (CEU), see 
Methods) averaged across SNPs with minor allele frequencies 
(MAFs) above 5% is estimated to be 43% for alleles with a relative 
risk of 1.3, increasing to 80% for a relative risk of 1.5, for a P-value 
threshold of 5 × 10⁻⁷ (Supplementary Table 2).
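
To give a feel for how power depends on relative risk at such a stringent threshold, the Python sketch below uses a simple normal approximation for a comparison of allele frequencies between cases and controls. It is a back-of-the-envelope calculation and is not the simulation procedure described in Methods. It assumes a directly typed causal SNP, a multiplicative per-allele risk model and controls that reflect the population allele frequency, so its output should not be expected to reproduce the figures in Supplementary Table 2. The function names and the allele frequency of 0.2 are hypothetical.

```python
from statistics import NormalDist


def case_allele_frequency(control_freq, relative_risk):
    """Risk-allele frequency in cases under a multiplicative per-allele model."""
    weighted = relative_risk * control_freq
    return weighted / (weighted + (1.0 - control_freq))


def allelic_test_power(control_freq, relative_risk, n_cases, n_controls, alpha):
    """Approximate power of a two-sided allele-frequency comparison."""
    std_normal = NormalDist()
    case_freq = case_allele_frequency(control_freq, relative_risk)
    # Each individual contributes two alleles to the comparison.
    variance = (case_freq * (1.0 - case_freq) / (2 * n_cases)
                + control_freq * (1.0 - control_freq) / (2 * n_controls))
    z_effect = abs(case_freq - control_freq) / variance ** 0.5
    z_critical = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # The contribution of the opposite tail is negligible and is ignored.
    return std_normal.cdf(z_effect - z_critical)


for rr in (1.1, 1.3, 1.5):
    power = allelic_test_power(0.2, rr, 2000, 3000, 5e-7)
    print(f"relative risk {rr}: approximate power {power:.2f}")
```

Even under these favourable assumptions, the approximation shows power falling away quickly as the relative risk approaches 1, which is the point the text makes about the need for large samples.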

We developed a new algorithm, CHIAMO, which we applied to 
simultaneously call the genotypes from all individuals (see Methods 
and Supplementary Information). Cross-platform comparison showed 
CHIAMO to outperform BRLMM (the standard Affymetrix algo- 
rithm) by having an error rate under 0.2% (Supplementary Table 3), 
and comparison of 10° duplicate genotypes in our study gave a dis- 
cordance rate of 0.12%. 
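
A discordance rate of this kind can be computed by comparing the two sets of calls for a duplicated sample and counting disagreements among the genotypes called in both. The Python sketch below assumes genotypes coded as 0, 1 or 2 copies of an allele, with `None` marking a missing call. The coding and the function name are assumptions for illustration and are not taken from CHIAMO.

```python
def discordance_rate(first_calls, second_calls):
    """Fraction of genotypes, called in both replicates, that disagree."""
    if len(first_calls) != len(second_calls):
        raise ValueError("replicates must cover the same SNPs")

    # Only SNPs with a call in both replicates can be compared.
    compared = [(a, b) for a, b in zip(first_calls, second_calls)
                if a is not None and b is not None]
    if not compared:
        raise ValueError("no SNP was called in both replicates")

    mismatches = sum(1 for a, b in compared if a != b)
    return mismatches / len(compared)


# Toy example: one disagreement among four comparable genotypes.
print(discordance_rate([0, 1, 2, None, 1], [0, 1, 1, 2, 1]))
```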

We excluded 809 samples after checks for contamination, false 
identity, non-Caucasian ancestry and relatedness (see Methods and 
Supplementary Table 4); 16,179 individuals remained in the study. 

Genome-wide, 469,557 SNPs (93.8%) passed our quality control 
filters (described in Methods) giving an average call rate of 99.63%. Of 
those, 392,575 have study-wide MAFs > 1% (45,106 have MAFs < 
0.1%; see also Supplementary Figs 1 and 2). Initial analyses of the 
polymorphic SNPs suggest that patterns of linkage disequilibrium 
in our samples are very similar to those in HapMap (Supplementary 
Fig. 3). Therefore, we expect genome coverage with the Affymetrix 
500K set in this study to be similar to that estimated for the HapMap 
CEU panel.
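
The two per-SNP summaries used in this paragraph, call rate and minor allele frequency, are simple to compute once genotypes are coded numerically. The Python sketch below assumes a coding of 0, 1 or 2 copies of one allele, with `None` for a failed call. The thresholds passed to `passes_filters` are placeholders chosen for the example, because the project's actual filters are described in Methods and not in this section.

```python
def call_rate(genotypes):
    """Proportion of samples with a non-missing genotype at one SNP."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among called genotypes (coded 0/1/2)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    freq = sum(called) / (2 * len(called))   # two alleles per individual
    return min(freq, 1.0 - freq)


def passes_filters(genotypes, min_call_rate, min_maf):
    """Apply illustrative call-rate and MAF thresholds to one SNP."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


snp = [0, 0, 1, 0, 2, None, 0, 1]
print(call_rate(snp), minor_allele_frequency(snp))
```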

All SNPs passing quality control filters were used in the association 
analyses, although power is very low for SNPs with low MAFs (unless 
they have unusually large effects). On visual inspection of the cluster 
plots of SNPs showing apparently strong association, we removed a 
further 638 SNPs with poor clustering. 


Control groups 


Our main purpose in using two control groups was to assess possible 
bias in ascertaining control samples. In addition, noting that DNA 
sample processing differed between these groups, comparison of con- 
trol groups also provides a check for effects of differential genotyping 
errors as a result of differences in DNA collection and preparation. 
Figure 1a shows the results of 1-d.f. Mantel-extension tests for differences
in allele frequencies of SNPs between subjects from the 58C
and UKBS collections, stratified by 12 broad regions of Great Britain 
(see Supplementary Table 5 and Supplementary Fig. 4 for region 
definitions). The associated quantile-quantile plot (see Methods for 
background) in Fig. 1b shows good agreement with the null distri- 
bution (similar results are obtained for tests that do not stratify by 
geography, data not shown). The fact that we see few significant dif- 
ferences between these two control groups despite the fact that they 
differ in population groups sampled, DNA processing, and age, indi- 
cates that there would be little bias due to use of either sample as a 
control group for any of the case series, and justifies our combining of 
the two control groups to form a single group of 3,000 subjects for our 
main analyses. 
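
The comparison of two control groups at a single SNP can be sketched with an unstratified 1-d.f. trend test (the Cochran-Armitage test). This is a simplification: the analysis described above stratifies by geographical region, which the example below does not do. The function name, the additive scores (0, 1, 2) and the genotype counts are assumptions for illustration.

```python
import math


def trend_test(counts_a, counts_b, scores=(0, 1, 2)):
    """Cochran-Armitage trend test comparing two groups at one SNP.

    counts_a and counts_b hold the numbers of individuals with 0, 1 and 2
    copies of an allele in each group. Returns the 1-d.f. chi-square
    statistic and its P value.
    """
    total_a = sum(counts_a)
    total_b = sum(counts_b)
    n = total_a + total_b
    combined = [a + b for a, b in zip(counts_a, counts_b)]

    score_a = sum(s * a for s, a in zip(scores, counts_a))
    score_all = sum(s * c for s, c in zip(scores, combined))
    score_sq_all = sum(s * s * c for s, c in zip(scores, combined))

    statistic = n * score_a - total_a * score_all
    variance = total_a * total_b * (n * score_sq_all - score_all ** 2) / n
    if variance == 0:
        raise ValueError("SNP is monomorphic or a group is empty")

    chi2 = statistic * statistic / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts in two control collections of 1,500 each.
chi2, p_value = trend_test((960, 480, 60), (945, 490, 65))
print(f"chi-square = {chi2:.3f}, P = {p_value:.3f}")
```

Applied across all SNPs, the resulting statistics are what a quantile-quantile plot compares against the null chi-square distribution.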

One consequence of using a shared control group (for which
detailed phenotyping for all traits of interest is not available) relates 
to the potential for misclassification bias: a proportion of the controls 
is likely to have the disease of interest (and therefore might meet the 
criteria for inclusion as a case) and some others will develop it in 
the future. However, the effect this has on power is modest unless the 
extent of misclassification bias is substantial; for example, if 5% of 
controls would meet the definition of cases at the same age, the loss of 
power is approximately the same as that due to a reduction of the 
sample size by 10%. Even for the higher prevalence conditions examined
by the WTCCC (such as HT, CAD and T2D), the precise ascertainment
schemes used here (which enriched for more extreme
phenotypes and/or strong family history) will have limited the pro- 
portions of controls meeting case criteria to low levels (for example, 
to <5%). Although a study design which used ‘hypercontrols’ (that 
is, selection of control individuals from the lower extremity of the 
relevant trait distribution) would generally be the most powerful 
approach in a study focusing on one disease, the merits of such an 
approach need to be weighed against the additional costs associated 
with the need to phenotype and genotype each control sample. 
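
A simple way to see why 5% misclassification costs roughly 10% of the sample is to note that unrecognized cases pull the control allele frequency towards the case frequency. If a fraction m of controls are really cases, the observed case-control difference shrinks by a factor of (1 − m), and because the test statistic scales with the squared difference times the sample size, the effective sample size shrinks by about (1 − m)². The Python sketch below works through this approximation, ignoring the small accompanying change in variance. The function names and example frequencies are hypothetical.

```python
def observed_difference(case_freq, control_freq, misclassified):
    """Case-control allele-frequency difference when some controls are cases."""
    if not 0.0 <= misclassified < 1.0:
        raise ValueError("misclassified fraction must be in [0, 1)")
    contaminated_control = ((1.0 - misclassified) * control_freq
                            + misclassified * case_freq)
    return case_freq - contaminated_control


def retained_sample_fraction(misclassified):
    """Approximate fraction of the effective sample size that is retained."""
    if not 0.0 <= misclassified < 1.0:
        raise ValueError("misclassified fraction must be in [0, 1)")
    return (1.0 - misclassified) ** 2


print(observed_difference(0.25, 0.20, 0.05))
print(retained_sample_fraction(0.05))
```

With m = 0.05 the retained fraction is 0.9025, which corresponds to the loss of about 10% quoted above.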


Geographical variation and population structure 


An additional cause of false positive findings is hidden population 
structure. Case and control samples may differ in the distribution of 
their ancestry, either owing to control sampling effects, as discussed 
above, or to confounding when different ancestries carry higher dis- 
ease risk and are, as a result, over-represented in cases. Even after 
exclusion of individuals with evidence of recent non-European 
ancestry, the British population is heterogeneous, having been 
shaped by several waves of immigration from southern and northern 
Europe. Whether the differences between these incoming popula- 
tions are sufficiently large to distort the findings of population-based 
case-control studies is an open question. 

Figure 1 | Genome-wide scan for allele frequency differences between
controls. a, P values from the trend test for differences between SNP allele
frequencies in the two control groups, stratified by geographical region.
SNPs have been excluded on the basis of failure in a test for Hardy-Weinberg
equilibrium in either control group considered separately, a low call rate, or
if minor allele frequency is less than 1%, but not on the basis of a difference
between control groups. Green dots indicate SNPs with a P value < 1 × 10⁻⁵.
b, Quantile-quantile plots of these test statistics. In this and subsequent
quantile-quantile plots, the shaded region is the 95% concentration band
(see Methods).

Figure 2 | Genome-wide picture of geographic variation. a, P values for the
11-d.f. test for difference in SNP allele frequencies between geographical
regions, within the 9 collections. SNPs have been excluded using the project
quality control filters described in Methods. Green dots indicate SNPs with a
P value < 1 × 10⁻⁵. b, Quantile-quantile plots of these test statistics. SNPs at
which the test statistic exceeds 100 are represented by triangles at the top of
the plot, and the shaded region is the 95% concentration band (see
Methods). Also shown in blue is the quantile-quantile plot resulting from
removal of all SNPs in the 13 most differentiated regions (Table 1).

We first examined our samples for non-European ancestry, using
multidimensional scaling after ‘seeding’ our data with those from
the three HapMap analysis panels (see Supplementary Fig. 5 and
Methods), and excluded 153 individuals on this basis. We next
looked for evidence of population heterogeneity by studying allele
frequency differences between the 12 broad geographical regions
(defined in Supplementary Fig. 4). The results for these 11-d.f. tests
and associated quantile-quantile plots are shown in Fig. 2. Widespread
small differences in allele frequencies are evident as an
increased slope of the line (Fig. 2b); in addition, a few loci show much
larger differences (Fig. 2a and Supplementary Fig. 6).
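
A test with 11 degrees of freedom arises naturally when allele frequencies are compared across 12 regions. The Python sketch below computes a Pearson chi-square statistic on a table of allele counts by region. It is a hedged illustration: it does not reproduce the within-collection analysis described in the caption to Fig. 2, and the function name and counts are hypothetical. The standard library has no chi-square tail function, so only the statistic and its degrees of freedom are returned. A P value could be obtained from a statistics package, for example `scipy.stats.chi2.sf`.

```python
def allele_frequency_heterogeneity(region_counts):
    """Pearson chi-square for allele-frequency differences between regions.

    region_counts is a list of (allele_1_count, allele_2_count) pairs, one
    per region. Returns the statistic and its degrees of freedom (regions - 1).
    """
    n_regions = len(region_counts)
    total_1 = sum(a for a, _ in region_counts)
    total_2 = sum(b for _, b in region_counts)
    total = total_1 + total_2
    if n_regions < 2 or total_1 == 0 or total_2 == 0:
        raise ValueError("need at least two regions and a polymorphic SNP")

    chi2 = 0.0
    for count_1, count_2 in region_counts:
        region_total = count_1 + count_2
        if region_total == 0:
            raise ValueError("every region must contribute alleles")
        expected_1 = region_total * total_1 / total
        expected_2 = region_total * total_2 / total
        chi2 += (count_1 - expected_1) ** 2 / expected_1
        chi2 += (count_2 - expected_2) ** 2 / expected_2
    return chi2, n_regions - 1


# Hypothetical allele counts for 12 regions with a mild frequency gradient.
regions = [(300 + 5 * i, 700 - 5 * i) for i in range(12)]
chi2, dof = allele_frequency_heterogeneity(regions)
print(f"chi-square = {chi2:.2f} on {dof} d.f.")
```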

Thirteen genomic regions showing strong geographical variation 
are listed in Table 1, and Supplementary Fig. 7 shows the way in which 
their allele frequencies vary geographically. The predominant pattern 
is variation along a NW/SE axis. The most likely cause for these 
marked geographical differences is natural selection, most plausibly 
in populations ancestral to those now in the UK. Variation due to 
selection has previously been implicated at LCT (lactase) and major 
histocompatibility complex (MHC), and within-UK differentiation 
at 4p14 has been found independently, but others seem to be new 
findings. All but three of the regions contain known genes. Aside from 
evolutionary interest, genes showing evidence of natural selection are 
particularly interesting for the biology of traits such as infectious dis- 
eases; possible targets for selection include NADSYN1 (NAD synthe- 
tase 1) at 11q13, which could have a role in prevention of pellagra, as 
well as TLR1 (toll-like receptor 1) at 4p14, for which a role in the 
biology of tuberculosis and leprosy has been suggested. 

There may be important population structure that is not well 
captured by current geographical region of residence. Present 
implementations of strongly model-based approaches such as 
STRUCTURE are impracticable for data sets of this size, and we 
reverted to the classical method of principal components, using a 
subset of 197,175 SNPs chosen to reduce inter-locus linkage disequi- 
librium. Nevertheless, four of the first six principal components 
clearly picked up effects attributable to local linkage disequilibrium 
rather than genome-wide structure. The remaining two components 
show the same predominant geographical trend from NW to SE but, 
perhaps unsurprisingly, London is set somewhat apart (Supplemen- 
tary Fig. 8). 

The overall effect of population structure on our association 
results seems to be small, once recent migrants from outside 
Europe are excluded. Estimates of over-dispersion of the association 
trend test statistics (usually denoted λ; ref. 15) ranged from 1.03 and 
1.05 for RA and T1D, respectively, to 1.08-1.11 for the remaining 
diseases. Some of this over-dispersion could be due to factors other 
than structure, and this possibility is supported by the fact that inclu- 
sion of the two ancestry informative principal components as cov- 
ariates in the association tests reduced the over-dispersion estimates 
only slightly (Supplementary Table 6), as did stratification by geo- 
graphical region. This impression is confirmed on noting that 
P values with and without correction for structure are similar 
(Supplementary Fig. 9). We conclude that, for most of the genome, 
population structure has at most a small confounding effect in our 
study, and as a consequence the analyses reported below do not 
correct for structure. In principle, apparent associations in the few 
genomic regions identified in Table 1 as showing strong geographical 
differentiation should be interpreted with caution, but none arose in 
our analyses. 
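
For readers unfamiliar with the over-dispersion parameter λ, the following is a minimal sketch of one common way to estimate it: the median of the observed 1-d.f. test statistics divided by the median of the chi-square distribution with 1 d.f. This median-based estimator is an assumption made for illustration; the text does not state which estimator was used, and the function names are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 d.f., obtained as the square of
# the 75th percentile of the standard normal distribution (about 0.455).
CHI2_1DF_MEDIAN = statistics.NormalDist().inv_cdf(0.75) ** 2


def overdispersion_lambda(trend_statistics):
    """Median-based over-dispersion estimate for 1-d.f. test statistics."""
    return statistics.median(trend_statistics) / CHI2_1DF_MEDIAN


def corrected_statistics(trend_statistics):
    """Divide each statistic by lambda (applied only when lambda exceeds 1)."""
    lam = max(1.0, overdispersion_lambda(trend_statistics))
    return [s / lam for s in trend_statistics]
```

A value of λ close to 1, such as the 1.03 to 1.11 range quoted above, indicates that the bulk of the test statistics follow the null distribution closely, which is consistent with the decision not to correct for structure.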


Disease association results 


We assessed evidence for association in several ways (see Methods for 
details), drawing on both classical and bayesian statistical approaches. 
For polymorphic SNPs on the Affymetrix chip, we performed trend 
tests (1 degree of freedom) and general genotype tests (2 degrees of 
freedom, referred to as genotypic) between each case collection and 
the pooled controls, and calculated analogous Bayes factors. There 
are examples from animal models where genetic effects act differently 
in males and females, and to assess this in our data we applied a 


Chromosome  Genes  Region (Mb)  SNP  Position  P value 
2q21  LCT  135.16-136.82  rs1042712  136,379,576  5.54 x 10°38 
4p14  TLR1, TLR6, TLR10  38.51-38.74  rs7696175  38,643,552  1.51 x 10°17 
4q28  -  137.97-138.01  rs1460133  137,999,953  4.43 x 10°% 
6p25  IRF4  0.32-0.42  rs9378805  362,727  539x108 
6p21  HLA  31.10-31.55  rs3873375  31,359,339  1.07 x 10°71 
9p24  DMRT1  0.86-0.88  rs11790408  866,418  4.96 x 10°°” 
11p15  NAV2  19.55-19.70  rs12295525  19,661,808  744 x 10°°% 
11q13  NADSYN1, DHCR7  70.78-70.93  rs12797951  70,820,914  3.01 x 10° 
12p13  DYRK4, AKAP3, NDUFA9, RAD51AP1, GALNT8  4.37-4.82  rs10774241  4,553,727  2.73 Xx 10% 
14q12  HECTD1, AP4S1, STRN3  30.41-31.03  rs17449560  30,598,823  1.46 x 10°°” 
19q13  GIPR, SNRPD2, QPCTL, SIX5, DMPK, DMWD, RSHL1, SYMPK, FOXA3  50.84-51.09  rs3760843  50,980,546  4.19 x 10°°” 
20q12  -  38.30-38.77  rs2143877  38,526,309  1a <0 
Xp22  -  2.06-2.08  rs6644913  2,061,160  1.23 x 10-°” 
Properties of SNPs that show large allele frequency differences between samples of individuals from 12 regions across Great Britain. Regions showing differentiated SNPs are given with details of the SNP with the smallest P value in each region for differentiation on the 11-d.f. test of differences in SNP allele frequencies between geographical regions, within the 9 collections. Cluster plots for these SNPs have been examined visually. Signal plots appear in Supplementary Information. Positions are in NCBI build-35 coordinates. 

Box 1 | Significance levels in genome-wide studies 


There has been much debate concerning interpretation of significance 
levels in genome-wide association studies and whether, and how, these 
should be corrected for multiple testing. Classical multiple testing 
theory in statistics is concerned with the problem of ‘multiple tests’ of a 
single ‘global’ null hypothesis. This, we would argue, is a problem far 
removed from that which faces us in genome-wide association studies, 
where we face the problem of testing ‘multiple hypotheses’ (for a 
particular disease, one hypothesis for each SNP, or region of correlated 
SNPs, in the genome) and we thus do not subscribe to the view that one 
should correct significance levels for the number of tests performed to 
obtain ‘genome-wide significance levels’. Nonetheless, our aim is to 
keep the false positive rate within acceptable bounds and this still leads 
to the view that very low P values are needed for strong evidence of 
association. But the factor determining the threshold is not the number 
of tests performed, but the a priori probability that there is likely to be a 
true association at any specified location in the genome. Of course, we 
cannot know this prior probability from objective evidence, but we can 
perhaps estimate an order of magnitude. 

There are two linked questions. The first concerns the choice of an 
appropriate ‘threshold’ for reporting possible associations as likely to 
be genuine. Here the mathematics is quite straightforward if we make 
the simplifying assumption that we have the same power to detect all 
true associations. Then we have'® 

Posterior odds for true association = Prior odds × Power / Significance threshold 
That is, for a given significance threshold, the probability of a true 
association depends on the prior odds and, crucially, the power. A 
plausible estimate for the prior odds of true association at any specified 
locus might be of the order of 100,000:1 against, for example, on the 
basis of 1,000,000 ‘independent’ regions of the genome and an 
expectation of 10 detectable genes involved in the condition. (Other 
plausible estimates might vary from this by an order of magnitude or so 
in either direction.) Then, assuming a power of 0.5 and a significance 
threshold of 5 × 10^-7, the posterior odds in favour of a ‘hit’ being a true 
association would be 10:1. However, if we relax this significance 
threshold by a factor of ten, or alternatively if the power were lower by 
a factor of 10, the posterior odds that a ‘hit’ is a true association would 
also be reduced by a factor of ten. This simple mathematical analysis is 
little affected by allowing for the fact that true associations come in 
various sizes with varying power to detect them; the above formula is 
simply modified by interpreting ‘power’ as the mean power. 
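
The arithmetic in this paragraph can be checked with a few lines of code. The sketch below simply evaluates the formula above for the figures quoted in the box (prior odds of 100,000:1 against, power 0.5, threshold 5 × 10^-7); the function name is hypothetical, and 'power' can be read as the mean power, as noted above.

```python
def posterior_odds(prior_odds, power, significance_threshold):
    """Posterior odds that a 'hit' at a given threshold is a true association."""
    return prior_odds * power / significance_threshold


prior = 1 / 100_000  # prior odds of 100,000:1 against a true association

baseline = posterior_odds(prior, power=0.5, significance_threshold=5e-7)
relaxed = posterior_odds(prior, power=0.5, significance_threshold=5e-6)
low_power = posterior_odds(prior, power=0.05, significance_threshold=5e-7)

# baseline is about 10 (odds of 10:1 in favour); relaxing the threshold
# tenfold, or reducing the power tenfold, brings the odds down to about 1.
print(baseline, relaxed, low_power)
```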

The above discussion concerns ‘average’ properties of ‘hits’ 
achieving given significance levels. After the association data are 
available, a related but different question is whether a particular 
positive finding is likely to be a true one. For that calculation, the prior 
odds must be multiplied by the Bayes factor, the ratio of the probability 
of the observed data under the assumption that there is a true 
association to its probability under the null hypothesis. As in power 
calculations, the calculation of Bayes factors requires assumptions 
about effect sizes (see Methods for details). 
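
For a single finding, the corresponding calculation multiplies the prior odds by the Bayes factor. The sketch below does this on the log10 scale and converts the resulting odds into a probability. The prior odds and the Bayes factor values are hypothetical inputs chosen for illustration, not results from the study.

```python
def posterior_probability(prior_odds, log10_bayes_factor):
    """Posterior probability of a true association for one finding."""
    odds = prior_odds * 10 ** log10_bayes_factor
    return odds / (1 + odds)


# With prior odds of 100,000:1 against, a log10 Bayes factor of 5 gives
# even posterior odds, and a log10 Bayes factor of 6 gives odds near 10:1.
print(posterior_probability(1e-5, 5))
print(posterior_probability(1e-5, 6))
```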

A key point from both perspectives is that interpreting the strength 
of evidence in an association study depends on the likely number of 
true associations, and the power to detect them which, in turn, 
depends on effect sizes and sample size. In a less-well-powered study 
it would be necessary to adopt more stringent thresholds to control the 
false-positive rate. Thus, when comparing two studies for a particular 
disease, with a hit with the same MAF and P value for association, the 
likelihood that this is a true positive will in general be greater for the 
study that is better powered, typically the larger study. In practice, 
smaller studies often employ less stringent P-value thresholds, which is 
precisely the opposite of what should occur. 


sex-differentiated test which is sensitive to associations of a different 
magnitude and/or direction in the two sexes. 
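
To make the difference between the 1-d.f. trend test and the 2-d.f. genotypic test concrete, here is a minimal sketch that computes both from genotype counts (ordered as 0, 1 and 2 copies of an allele) in cases and controls. It assumes the trend test is the standard Cochran-Armitage test with additive weights and that the genotypic test is a Pearson chi-square on the 2 × 3 table; the function names and the counts are hypothetical, and the implementation used in the study is described in Methods.

```python
import math


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test; returns (chi_square, p_value), 1 d.f."""
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, totals))
    numerator = n * (n * sum_wr - n_cases * sum_wn) ** 2
    denominator = n_cases * n_controls * (n * sum_w2n - sum_wn ** 2)
    chi_square = numerator / denominator
    # Upper tail of a chi-square distribution with 1 d.f.
    return chi_square, math.erfc(math.sqrt(chi_square / 2))


def genotypic_test(cases, controls):
    """Pearson chi-square on the 2 x 3 genotype table; 2 d.f."""
    if any(r + s == 0 for r, s in zip(cases, controls)):
        raise ValueError("all three genotype classes must be observed")
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    chi_square = 0.0
    for r, s in zip(cases, controls):
        expected_case = (r + s) * n_cases / n
        expected_control = (r + s) * n_controls / n
        chi_square += (r - expected_case) ** 2 / expected_case
        chi_square += (s - expected_control) ** 2 / expected_control
    # Upper tail of a chi-square distribution with 2 d.f.
    return chi_square, math.exp(-chi_square / 2)


# Hypothetical genotype counts with a purely additive difference.
print(trend_test((20, 50, 30), (30, 50, 20)))
print(genotypic_test((20, 50, 30), (30, 50, 20)))
```

When the effect is additive, as in this toy example, both statistics are similar in size but the trend test spends only one degree of freedom and so gives the smaller P value; the genotypic test is better suited to dominant or recessive patterns.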

Our study also allows us to look for loci which may have an effect in 
more than one disease. To assess this, we compared our common 
controls with all cases in each of three natural groupings of diseases: 
CAD+HT+T2D (metabolic and cardiovascular phenotypes with 
potential aetiological overlap, for example, involving defects in insu- 
lin action); RA+T1D (already known to share common loci); and 
CD+RA+T1D (all autoimmune diseases). 

To help to capture putative disease loci not on the Affymetrix chip 
we used a new multilocus method in which a population genetics 
model is applied to our genotype data and the HapMap reference 
samples to simulate, or impute, genotype data at 2,193,483 HapMap 
SNPs not on the Affymetrix chip. These imputed, or in silico, geno- 
types are then tested for association in the same ways as SNPs geno- 
typed in the project. 

Before detailing the principal results for each disease, we first sum- 
marize our main observations. Table 2 details the findings from the 
WTCCC scan for the 15 variants for which there was strong prior 
evidence of association with one or more of the diseases studied, 
based on extensive replication studies. All but two of these show 
associations in our study, with the magnitude of the evidence gen- 
erally consistent with their effect sizes as estimated from prior studies. 
One of the signals for which we failed to obtain evidence of replica- 
tion (APOE in CAD) is poorly tagged by the Affymetrix 500K chip. 
The other (INS in T1D) is represented by a single SNP that marginally 
failed our study-wide quality control filters (overall missingness 
5.2%) but which was nonetheless strongly associated with T1D when 
examined. Quantile-quantile plots for the trend test for each of the 
seven diseases show only very minor deviations from the null distri- 
bution, except in the extreme tails which correspond to associations 
reported below (Fig. 3). The quantile-quantile plots and the results at 
positive controls (Table 2) give confidence in the quality of our data 
and the robustness of our analyses. 

Our genome-wide results for the trend test are illustrated in Fig. 4. 
The single-disease trend and genotypic tests for SNPs on the chip 
identified 21 signals across the 7 diseases that exceeded a threshold of 
5 × 10^-7 (Table 3). For each of these SNPs (except those within the 
MHC), cluster plots are shown in Supplementary Fig. 10 and ‘signal 
plots’ in Fig. 5. These signal plots estimate the likely demarcation of 
the hit region and show the signal at genotyped and imputed SNPs 
together with local genomic context. Four further strong (with 
P < 5 × 10^-7) associations were revealed by the other primary analyses 
described (Table 3). One locus (in RA) was revealed by the sex-differentiated 
analysis, two through multilocus approaches (both for 
T1D) and one through an analysis which combined cases from more 
than one autoimmune disease (signal plots in Supplementary Figs 11, 
12 and 13, respectively). 

All of these signals were subjected to visual inspection of cluster 
plots, and in all cases (with one exception noted below) nearby corre- 
lated SNPs also showed a strong signal (see signal plots). Thus, geno- 
typing artefacts are unlikely to be responsible for these associations. 
Indeed, at the time of writing, 12 of these 25 strong signals represent 
replications of previously reported findings (only those with extensive 
prior replication are reported in Table 2). Of the remainder, follow-up 
studies (reported elsewhere) have confirmed all but one of the loci (ten 
in total) for which replication has been attempted. The other 
replication study gave equivocal results. Of the 18 loci implicated in 
autoimmune diseases, 5 show associations (P < 0.001) to more than 1 
condition, leading to a number of further potential new associations, 
at least one of which has also been replicated. 

It is likely that further susceptibility genes will be identified through 
follow-up of other signals for which the evidence from our scan is less 
conclusive (see below for some specific examples). For example, there 
are 58 further signals with single-point P values between 10^-5 and 
5 × 10^-7 for which inspection of cluster plots verifies CHIAMO calls 
(Table 4). As described below, analyses which make use of selected case 
samples to expand the reference group should also provide a useful 
route to the prioritization of such putative signals for further analysis. 
For convenience, the strongest association results are presented sepa- 
rately for each disease in Supplementary Table 7. 

Several general points are relevant to interpretation of these dis- 
ease-association data. First, replication studies are required to con- 
firm associations from GWAs. For the reasons given in the box, we 
regard very low P values (say P < 5 × 10^-7) in our comparatively 
large sample size as strong evidence for association, and indeed all 

Table 2 | Evidence for signal of association at previously robustly replicated loci 


Collection Gene Chromosome Reported SNP WTCCC SNP HapMap r2 Trend P value Genotypic P value 

CAD APOE 19q13 % rs4420638 - 1.7 x 10° 1.7 x 10°° 
CD NOD2 16q12 rs2066844 rs17221417 0.23 94x 10° 40x 10712 
CD IL23R 1p31 rs11209026 rs11805303 0.01 65x10. 59x10" 
RA HLA-DRB1 6p21 * rs615672 = 2.6 x 10-2” 75 x 10°27 
RA PTPN22 1p13 rs2476601 rs6679677 0.75 4.9 x 10°76 56x 107° 
T1D HLA-DRB1 6p21 * rs9270986 - 40x10 176 2.3 x 10°17? 
T1D INS 11p15 rs689 † - - - 
T1D CTLA4 2q33 rs3087243 rs3087243 1 2.5 x 10°°° 18 10-2 
T1D PTPN22 1p13 rs2476601 rs6679677 0.75 12.1077" 54x 10°76 
T1D IL2RA 10p15 rs706778 rs2104286 0.25 8.0 x 10°° A3 x40 ° 
T1D IFIH1 2q24 rs1990760 rs3788964 0.26 1.9 x 10° 76 xX 10°° 
T2D PPARG 3p25 rs1801282 rs1801282 1 13x 10° 54x10 % 
T2D KCNJ11 11p15 rs5219 rs5215 0.9 1.33810" ° 56x10 ° 
T2D TCF7L2 10q25 rs7903146 rs4506565 0.92 57x10 51x10" 


Where information on the strength of association at a particular SNP had been previously published and replicated we tabulated the P value of both the trend and genotype test at the same SNP (if in our study), or the best tag SNP (defined to be the SNP with highest r2 with the reported SNP, calculated in the CEU sample of the HapMap project). Positions are in NCBI build-35 coordinates. 
*Previous reports relate to haplotypes rather than single SNPs. †Not well tagged by SNPs that pass the quality control, see main text. 


or most of the loci we find at this level are either already known or 
have now been confirmed by subsequent replication. Such replica- 
tion studies are also the substrate for efforts to determine the range of 
associated phenotypes and to identify and characterize pathologically 
relevant variation. 

Second, failure to detect a prominent association signal in the pre- 
sent study cannot provide conclusive exclusion of any given gene. This 
is the consequence of several factors including: less-than-complete 
coverage of common variation genome-wide on the Affymetrix chip; 
poor coverage (by design) of rare variants, including many structural 
variants (thereby reducing power to detect rare, penetrant, alleles); 
difficulties with defining the full genomic extent of the gene of interest; 
and, despite the sample size, relatively low power to detect, at levels of 
significance appropriate for genome-wide analysis, variants with 
modest effect sizes (odds ratio (OR) < 1.2). 

Third, whereas the association signals detected can help to define 
regions of interest, they cannot provide unambiguous identification 
of the causal genes. Nevertheless, assessments on the basis of posi- 
tional candidacy carry considerable weight, and, as we show, these 
already allow us, for selected diseases, to highlight pathways and 
mechanisms of particular interest. Naturally, extensive resequencing 
and fine-mapping work, followed by functional studies will be 
required before such inferences can be translated into robust state- 
ments about the molecular and physiological mechanisms involved. 

We turn now to a discussion of the main findings for each disease, 
focusing here only on the most significant and interesting results 

Figure 3 | Quantile-quantile plots for seven genome-wide scans. For each 
of the seven disease collections, a quantile-quantile plot of the results of the 
trend test is shown in black for all SNPs that pass the standard project filters, 
have a minor allele frequency >1% and missing data rate <1%. SNPs that 
were visually inspected and revealed genotype calling problems were 
excluded. These filters were chosen to minimize the influence of 
genotype-calling artefacts. Each quantile-quantile plot shown in black involves around 
360,000 SNPs. SNPs at which the test statistic exceeds 30 are represented by 
triangles. Additional quantile-quantile plots, which also exclude all SNPs 
located in the regions of association listed in Table 3, are superimposed in 
blue (for BD, the exclusion of these SNPs has no visible effect on the plot, and 
for HT there are no such SNPs). The blue quantile-quantile plots show that 
departures in the extreme tail of the distribution of test statistics are due to 
regions with a strong signal for association. 

from the analyses described above, and consideration of an expanded 
reference group, described below. 

Bipolar disorder (BD). Bipolar disorder (BD; manic depressive illness) 
refers to an episodic recurrent pathological disturbance in 
mood (affect) ranging from extreme elation or mania to severe depres- 
sion and usually accompanied by disturbances in thinking and beha- 
viour: psychotic features (delusions and hallucinations) often occur. 
Pathogenesis is poorly understood but there is robust evidence for a 
substantial genetic contribution to risk. The estimated sibling 
recurrence risk (λs) is 7-10 and heritability 80-90%. The definition 
of BD phenotype is based solely on clinical features because, as yet, 
psychiatry lacks validating diagnostic tests such as those available for 
many physical illnesses. Indeed, a major goal of molecular genetics 
approaches to psychiatric illness is an improvement in diagnostic 
classification that will follow identification of the biological systems 
that underpin the clinical syndromes. The phenotype definition that 
we have used includes individuals that have suffered one or more 
episodes of pathologically elevated mood (see Methods), a criterion 
that captures the clinical spectrum of bipolar mood variation that 
shows familial aggregation. 

Several genomic regions have been implicated in linkage studies 
and, recently, replicated evidence implicating specific genes has been 
reported. Increasing evidence suggests an overlap in genetic susceptibility 
with schizophrenia, a psychotic disorder with many similarities 
to BD. In particular, association findings have been reported with 
both disorders at DAOA (D-amino acid oxidase activator), DISC1 
(disrupted in schizophrenia 1), NRG1 (neuregulin 1) and DTNBP1 
(dystrobrevin binding protein 1). 

The strongest signal in BD was with rs420259 at chromosome 
16p12 (genotypic test P = 6.3 × 10^-8; Table 3) and the best-fitting 
genetic model was recessive (Supplementary Table 8). Although 
recognizing that this signal was not additionally supported by the 
expanded reference group analysis (see below and Supplementary 
Table 9) and that independent replication is essential, we note that 
several genes at this locus could have pathological relevance to BD 
(Fig. 5). These include PALB2 (partner and localizer of BRCA2), 
which is involved in stability of key nuclear structures including 
chromatin and the nuclear matrix; NDUFAB1 (NADH dehydrogenase 
(ubiquinone) 1, alpha/beta subcomplex, 1), which encodes a 
subunit of complex I of the mitochondrial respiratory chain; and 
DCTN5 (dynactin 5), which encodes a protein involved in intracellular 
transport that is known to interact with the gene ‘disrupted in 
schizophrenia 1’ (DISC1), the latter having been implicated in 
susceptibility to bipolar disorder as well as schizophrenia. 

Of the four regions showing association at P < 5 × 10^-7 in the 
expanded reference group analysis (Supplementary Table 9), it is of 
interest that the closest gene to the signal at rs1526805 (P = 2.2 × 
10^-7) is KCNC2, which encodes the Shaw-related voltage-gated potassium 
channel. Ion channelopathies are well-recognized as causes of 
episodic central nervous system disease, including seizures, ataxias 

Figure 4 | Genome-wide scan for seven diseases. For each of seven diseases 
-log10 of the trend test P value for quality-control-positive SNPs, excluding 
those in each disease that were excluded for having poor clustering after 
visual inspection, are plotted against position on each chromosome. 
Chromosomes are shown in alternating colours for clarity, with 
P values <1 × 10^-5 highlighted in green. All panels are truncated at 
-log10(P value) = 15, although some markers (for example, in the MHC in 
T1D and RA) exceed this significance threshold. 

Collection  Chromosome  Region (Mb)  SNP  Trend P value  Genotypic P value  log10(BF) trend  log10(BF) genotypic  Risk allele  Minor allele  Heterozygote odds ratio  Homozygote odds ratio  Control MAF  Case MAF 
Standard analysis 
BD 16p12 = 23.3-23.62 rs420259 219x10°% 629x10°% 196 479 A G 2.08(1.60-2.71) 2.07 (1.6-2.69) 0.282 0.248 
CAD 9p21 = 21.93-22.12 __rs1333049 179x104 116x10°% 1166 1119 C C 147(1.27-1.70) 1.9 (1.61-2.24) 0.474 0.554 
CD 1p31 67.3-67.48 s11805303 645x10% 585x10°% 1007 941 T T 1.39(1.22-1.58) 1.86 (1.54-2.24) 0.317 0.391 
CD 2q37 =. 233.92-234._ rs10210302. 7.10 x10 526x107 1111 11.28 T C 1.19(1.01-1.41) 1.85(1.56-2.21) 0.481 0.402 
CD 3p21 49.3-49.87 rs9858542 771 x10°° 358x10°% 424 522 A A _ 1.09(0,96-1.24) 184 (1.49-2.26) 0.282 0.331 
cD 5p13 40.32-40.66 = rs17234657. «2.13 x10 8 199x100 1041 989 G G 1.54(1.34-1.76) 2.32 (1.59-3.39) 0125 0.181 
CD 5q33 150.15-150.31 rs1000113 510 x10°°% 315x109 536 501 T T 1.54(1.31-1.82) 1.92(0.92-4.00) 0.067 0.098 
CD 10q21 64.06-64.31 1s10761659 268x10° 175x10°% 469 413 G A = 1.23(1.05-1.45) 1.55 (1.3-1.84) 0.461 0.406 
CD 10q24 101.26-101.32 rs10883365 14110 °% 582x10% 591 548 G G = 1.2(1.03-1.39) 1.62(1.37-1.92) 0477 0.537 
CD 16q12 49.02-49.4  rs17221417. 93610 398x101? 893 847 G G 1.29(1.13-146) 1.92 (1.58-2.34) 0.287 0.356 
CD 18p11 2.76-12.91 rs2542151 456x10°°% 203x10° 542 500 G G 1.3(1.14-1.48) 2.01 (1.46-2.76) 0.163 0.208 
RA 1p13 3.54-114.16 1s6679677 490x107 555x107 2236 2199 A A _ 1.98(1.72-2.27) 3.32 (1.93-5.69) 0.096 0.168 
RA 6 MHC rs6457617* -3.44x 107° 518x10°” 7484 7318 T T 2.36(1.97-2.84) 5.21(4.31-6.30) 0.489 0.685 
T1D 1p13 3.54-114.16 1s6679677 117 x 10° 543x107 23.07 2283 A A 1.82(1.59-2.09) 5.19 (3.15-8.55) 0.096 0.169 
T1D 6 MHC rs9272346* 242x107 5.47x 10784 1419 142.2 A G_ 5.49 (4.83-6.24) 18.52 (27.03-12.69) 0.387 0.150 
TID 2q13 54.64-55.09_+s11171739. «114x107 9.71x10" 889 824 C C 1.34(1.17-1.54) 1.75 (1.48-2.06) 0.423 0.493 
T1D 2q24 109.82-111.49 1s17696736 217x107 151x100“ 1253 1188 G G 1.34(1.16-1.53) 1.94(1.65-2.29) 0.424 0.506 
T1D 6p13 10.93-11.37. _+s12708716 9.24x10°°% 492x10°° 515 470 A G 1.19(0.97-1.45) 1.55 (1.27-1.89) 0.350 0.297 
T2D 6p22 = 20.63-20.84 —_rs9465871 1.02 x10°° 3.34x10°° 415 398 C C 1.18(1.04-1.34) 2.17 (1.6-2.95) 0.178 0.218 
T2D 0q25 4.71-114.81  rs4506565 568x10% 505x10 *% 1014 943 T T 1.36(4.2-1.54) 1.88 (1.56-2.27) 0.324 0.395 
T2D 6q12 52.36-52.41 —_rs9939609 524x10°°% 191x107 535 505 A A 1.34(1.17-1.52) 1.55 (1.3-1.84) 0.398 0.453 
Multi-locus analysis 
T1D 4q27 123.26-123.92 1s6534347 448x10°7 183x10°°% 515 469 A A 1.30(1.10-1.55) 1.49 (1.25-1.78) 0.351 0.402 
T1D 2p13 9.71-9.86 rs3764021 719x10°° 508x10°% 212 455 C T 1.57(1.38-1.79) 148(1.25-1.75) 0.467 0.426 
Sex differentiated analysis 
RA 7q32 130.80-130.84 r1s11761231 391%x10°? 137x10° - - G A 144(1.19-1.75) 164(1.35-1.99) 0.375 0.327 
Combined cases 

RA+T1D 10p15 6.07-6.17 rs2104286 5.92x10°°% 252x10° 526 445 T C 135(1.11-1.65) 162 (1.34-1.97) 0.286 0.245 
Regions with at least one SNP with a P value of less than 5 × 10^-7 for our primary analyses. The log10 value of the Bayes factor (BF) for the bayesian analysis corresponding to the trend and genotypic tests is also given. Region marks the boundaries of signal defined by recombination and return of test statistics to background levels. The minor allele is defined in the controls and its frequency in that group as well as the case sample is reported. MAF, minor allele frequency. Cluster plots for each SNP have been inspected visually, and are shown in Supplementary Fig. 10. Positions are in NCBI build-35 coordinates. *Multiple SNPs in the MHC region are significant; we report the most extreme. 


and paralyses. It is possible that this may extend to episodic disturbances 
of mood and behaviour. 

Amongst the other higher ranked signals in the BD data set 
(Supplementary Table 7), there is support for the previously suggested 
importance of GABA neurotransmission (rs7680321 (P= 6.2 X 10 °) 
in GABRB1 encoding a ligand-gated ion channel (GABA A receptor, 
beta 1)), glutamate neurotransmission (rs1485171 (P = 9.7 X 10°”) 
in GRM7 (glutamate receptor, metabotropic 7)) and synaptic function 
(rs11089599 (P = 7.2 × 10^-5) in SYN3 (synapsin III)). 

We note that a broad range of genetic and non-genetic data point 
to the importance of analyses that use alternative approaches to 
phenotype definition, including symptom dimensions. Although 
beyond the scope of the current paper, such analyses will be required 
to maximize the potential of the current BD data set. 

Coronary artery disease (CAD). Coronary artery disease (coronary 
atherosclerosis) is a chronic degenerative condition in which lipid 
and fibrous matrix is deposited in the walls of the coronary arteries to 
form atheromatous plaques. It may be clinically silent or present 
with angina pectoris or acute myocardial infarction. Pathogenesis is 
complex, with endothelial dysfunction, oxidative stress and inflammation 
contributing to development and instability of the atherosclerotic 
plaque. 

In addition to lifestyle and environmental factors, genes are 
important in the aetiology of CAD. For early myocardial infarction, 
estimates of λs range from ~2 to ~7 (ref. 39). Genetic variation is 
thought likely to influence risk of CAD both directly and through 
effects on known CAD risk factors including hypertension, diabetes 
and hypercholesterolaemia. Genome-wide linkage studies have 
mapped several loci that may affect susceptibility to CAD/myocardial 
infarction, although for only two of these has the likely gene been 
identified (ALOX5AP (arachidonate 5-lipoxygenase-activating protein) 
and LTA4H (leukotriene A4 hydrolase)). Association studies 
have identified several plausible genetic variants affecting lipids, 
thrombosis, inflammation or vascular biology but for most the evidence 
is not yet conclusive. We did not find evidence for strong 
association at any of these genes within our study (Table 2 and 
Supplementary Table 10). 

The most notable new finding for CAD is the powerful association 
on chromosome 9p21.3 (Table 3; Fig. 5). Although the strongest 
signal is seen at rs1333049 (P = 1.8 × 10^-14), associations are seen 
for SNPs across > 100 kilobases. This region has not been highlighted 
in previous studies of CAD or myocardial infarction. The region 
of interest contains the coding sequences of genes for two cyclin 
dependent kinase inhibitors, CDKN2A (encoding p16INK4a) and 
CDKN2B (p15INK4b), although the most closely associated SNP is 
some distance removed. Both genes have multiple isoforms, have 
an important role in the regulation of the cell cycle and are widely 
expressed, with CDKN2B known to be expressed in the macrophages 
but not the smooth muscle cells of fibrofatty lesions. It 
is of interest that expression of CDKN2B is induced by transforming 
growth factor beta (TGF-β) and that the TGF-β signalling system is 
implicated in the pathogenesis of human atherosclerosis. Besides 
CDKN2A and CDKN2B, the only other known gene nearby is MTAP, 
which encodes methylthioadenosine phosphorylase, an enzyme that 
contributes to polyamine metabolism and is important for the 
salvage of both adenine and methionine. MTAP is ubiquitously 
expressed, including in the cardiovascular system. Further work is 
required to determine whether the CAD association at this locus is 
mediated through CDKN2A/B, MTAP or some other mechanism. 
The same region also shows replicated evidence of association to 
T2D in the WTCCC and other data sets, though different 
SNPs seem to be involved. 

None of the loci showing more modest associations with CAD 
(Table 4) includes genes hitherto strongly implicated in the pathogenesis 
of CAD. A potentially interesting association is at rs6922269 
(P= 6.3 X 10 °), an intronic SNP in MTHFD1L, which encodes 
methylenetetrahydrofolate dehydrogenase (NADP+-dependent) 
1-like, the mitochondrial isozyme of C1-tetrahydrofolate (THF) 
synthase. C1-THF synthases interconvert the one carbon units carried 
by the biologically active form of folic acid, C1-tetrahydrofolate. 
These are used in a variety of cellular processes including purine and 
methionine synthesis. Another enzyme in the same pathway, methylene 
THF reductase (encoded by MTHFR) is subject to a common 
mutation which influences plasma homocysteine level and has been 
associated with increased risk of coronary and other atherosclerotic 
disease. The possibility of a link between variants in MTHFD1L and 
CAD risk is supported by evidence that MTHFD1L activity also contributes 
to plasma homocysteine and that defects in the MTHFD1L 
pathway may increase plasma homocysteine level. 


An intronic SNP in ADAMTS17 (a disintegrin and metalloproteinase 
with thrombospondin motifs 17), which showed modest association 
(rs1994016; P= 1.1 X 107“) in our primary analysis, showed 
a much stronger association in the expanded reference group analysis 
(see below and Supplementary Table 9). Although the specific function 
of ADAMTS17 has not been determined, other members of 
the ADAMTS family have been implicated in vascular extracellular 
matrix degradation, vascular remodelling and atherosclerosis. 
Crohn’s disease (CD). Crohn’s disease is a common form of chronic 
inflammatory bowel disease*®. The pathogenic mechanisms are poorly 
understood, but probably involve a dysregulated immune response 
to commensal intestinal bacteria and possibly defects in mucosal 
barrier function or bacterial clearance’. Genetic predisposition to
CD is suggested by a λs of 17-35 and by twin studies that contrast
monozygotic concordance rates of 50% with only 10% in dizygotic
pairs”.

Figure 5 | Regions of the genome showing strong evidence of association. Characteristics of genomic regions 1.25 Mb to either side of ‘hit SNPs’ (SNPs with
lowest P values). Region boundaries (vertical dotted lines) were chosen to coincide with locations where test statistics returned to background levels and, where
possible, recombination hotspots. Upper panel, −log10(P values) for the test (trend or genotypic) with the smallest P value at the hit SNP. Black points represent
SNPs typed in the study, and grey points represent SNPs whose genotypes were imputed. SNPs imputed with higher confidence are shown in darker grey. Middle
panel, fine-scale recombination rate (centimorgans per Mb) estimated from Phase II HapMap. The purple line shows the cumulative genetic distance (in cM)

A number of CD-susceptibility loci have previously been defined, 
and all of these generate strong signals in our data (Table 2). In 2001, 
positional cloning identified CARD15 (caspase recruitment domain 
family, member 15; NOD2) as the first confirmed CD-susceptibility 
gene®*'. In the present study, this locus is represented by rs17221417
(P = 9.4 × 10⁻¹²). A second association, on chromosome 5q31 (ref.
62), has been widely replicated, although the identity of the causative
gene is disputed owing to extensive regional linkage disequilibrium®.
Here, the previously described risk haplotype is tagged by rs6596075
(P = 5.4 × 10⁻⁷).


[Figure 5 panels: T1D hit region, chromosome 1; T1D hit region, chromosome 12]
More recent studies have identified four further CD-susceptibility
loci, all of which are strongly replicated in the present study.
The association between CD and SNPs within IL23R (interleukin
23 receptor)” is here represented by a cluster of associated SNPs,
including rs11805303 (P = 6.5 × 10⁻¹³). The strongest signal for
CD in the present scan (at rs10210302; P = 7.1 × 10⁻¹⁴) maps to
the ATG16L1 (ATG16 autophagy related 16-like 1) gene and is in
strong linkage disequilibrium (r² = 0.97) with a non-synonymous
SNP (T300A, rs2241880) associated with CD in a German non-synonymous
SNP scan. The third is a locus at chromosome
10q21 around rs10761659 (P = 2.7 × 10⁻⁷) and represents a non-coding
intergenic SNP mapping 14-kb telomeric to gene ZNF365
and 55-kb centromeric to the pseudogene antiquitin-like 4, a


from the hit SNP. Lower panel, known genes, and sequence conservation in 17 vertebrates. Known genes (orange) in the hit region are listed in the upper right
part of each plot in chromosomal order, starting at the left edge of the region. The top track shows plus-strand genes and the middle track shows minus-strand
genes. Sequence conservation (bottom track) scores are based on the phylogenetic hidden Markov model phastCons. Highly conserved regions (phastCons
score ≥ 600) are shown in blue. Information in middle and lower panels is taken from the UCSC Genome Browser. Positions are in NCBI build-35 coordinates.
See Supplementary Information on ‘signal plots’.
Table 4 | Regions of the genome showing moderate evidence of association
Collection  Chromosome  Region (Mb)  SNP  Trend P value  Genotypic P value  log10(BF) additive  log10(BF) general  Risk allele  Minor allele  Heterozygote OR (95% CI)  Homozygote OR (95% CI)  Control MAF  Case MAF
BD 2p25. —-:11.94-12.00 rs4027132. -1.31x10°° 968x10°°% 307 284 A G~ 1.39(1.19-1.64) 1.51(1.27-1.79) 0.459 0.414
BD 2q12 = 104.41-104.58  rs7570682 3.11 X 10° 164x10°° 368 323 A A 1.23(1.09-1.40) 1.4 (1.28-212) 0.214 0.255 
BD 2q14 115.63-116.11 +1s1375144. 243x10°°% 131x10°° 380 292 A G .32 (1.07-1.63) 1.59 (1.29-1.96) 0.337. 0.291 
BD 2q37 ==. 241.23-241.28  1s2953145 1.11 x10°° 657x10°°%° 322 350 C G = 184(1.31-2.58) 2.14 (1.53-2.98) 0.226 0.189 
BD 3p23- 32.26-32.33 rs4276227.  4.57X 10° 262x10°° 352 304 C T  1.20(0.99-1.46) 1.49 (1.23-181) 0.371 0.326 
BD 3q27. ~—-: 184.29-184.40 _ rs683395 230x10°° 511x10°°% 387 3.73 G G_~ 147 (1.26-1.71) 1.30(0.69-2.46) 0.080 0.109 
BD 6p21  42.82-42.86 rs6458307. 3.43 x10° 435x10°°% -080 284 T T 0.84 (0.75-0.96) 1.39 (1.13-169) 0.312 0.321 
BD 8p12 34.22-34.61 rs2609653. 6.86 X 10 °° : 344 3.21 C C 143(1.19-1.71) 3.62 (1.26-10.44) 0.052 0.074 
BD 9q32 = 114.31-114.39  1s10982256 88010 °° 441x10°° 323 237 T C  1.26(1.08-1.47) 1.47 (1.24-1.74) 0.471 0.425 
BD 4q22 57.17-57.24 rs10134944 3.21x10°° 689x10°°% 373 359 T T 145 (1.24-168) 1.32 (0.74-233) 0.086 0.115 
BD 4q32 103.43-103.62 +s11622475 210x10°% 814x10°°% 387 324 C T 1.13(0.89-1.44) 1.47(1.17-1.86) 0.300 0.256 
BD 6q12 51.36-51.50 rs1344484. 164x10°°% 1.03x10°° 394 341 T C 1.24 (1.03-148) 1.52 (1.27-182) 0.402 0.353 
BD 20p13--3.70-3.73 rs3761218.  443x10°° 671x10°°%° 258 318 T C  0.97(0.81-1.15) 1.31(1.09-157) 0.397 0.356 
CAD 1043 236.77-236.85 rs17672135 1.04x10°% 235x10°% 236 388 T C  0.70(0.61-0.81) 1.32 (0.79-2.22) 0.134 0.108 
CAD 5q21 99.98-100.11 _rs383830 5.72x10°°% 134x10°° 349 3.26 T A 160(1.16-2.21) 1.92 (1.40-2.63) 0.220 0.182 
CAD 6q25  =151.34-151.42  1s6922269 6.33 10° 150x10° 338 314 A A 1.17 (1.04-1.32) 1.65 (1.32-2.06) 0.253 0.294 
CAD 16q23  81.72-81.79 rs8055236 9.73 X10°°% 560x10°°% 328 359 G T 91 (1.33-2.74) 2.23 (1.56-3.17) 0.198 0.162 
CAD 19q12 34.74-34.78 rs7250581 9.12x10°° 250x10° 330 287 G A 1.06 (0.79-1.43) 1.40 (1.05-186) 0.220 0.182 
CAD 22q12 25.01-25.06 rs688034 6.90x10°°% 375x10°°% 333 315 T T 111(0.98-1.25) 1.62 (1.34-1.95) 0.310 0.355 
CD q24 169.53-169.67  rs12037606 1.79x10°°° 109x10°° 389 3.35 A A 1.22(1.07-1.40) 1.52 (1.28-1.82) 0.388 0.438 
CD 5q23 =: 131.40-131.90_rs6596075 5.4010 °° 319x10°°% 454 401 C G ~= 1.55 (1.00-2.39) 2.06 (1.35-3.14) 0.166 0.127
CD 6p22 ~—-20.83-20.85 rs6908425. 5.13x10°° 110x10°° 355 338 C T 1.63 (1.18-2.25) 1.95 (1.43-267) 0.230 0.190
CD 6p21 32.79-32.91 rs9469220 865x109 228x10°°% 419 392 A A 1.14 (0.98-1.32) 1.52 (1.28-1.79) 0.481 0.534
CD 6q23. =: 138.06-138.17._rs7753394.  442x10°°% 259x10°° 352 299 C C  1.21(1.04-140) 1.48 (1.25-1.76) 0.482 0.531
CD 7q36 147.62-147.70 rs7807268 689x10°% 442x10°°% 333 358 G G  1.38(1.20-1.60) 1.47 (1.24-1.74) 0.462 0.509
CD Op15 38.52-38.57 rs6601764. -256X10°°% 895x100 °% 374 301 C C  1.16(1.01-1.33) 1.52 (1.28-180) 0.408 0.458
CD 9q13 _50.89-51.07 rs8111071 614%x10°° 1.75x10°° 348 329 G G _= 147 (1.25-1.73) 1.28 (0.56-2.88) 0.070 0.096
HT qg43.—s- 235.67-235.79_+s2820037 5.76X10 °° 7.66x10°° 254 399 T T 54 (1.03-2.31) 1.09 (0.74-1.62) 0.141 0.171 
HT 8q24 140.17-140.35 1s6997709 7.88x10° 436x10°° 332 260 G T  1.20(0.94-1.52) 1.49(1.18-1.89) 0.285 0.244 
HT 2p12 = 24.86-24.95 rs7961152, 7.39 10°°%° 3.03x10°° 329 251 A A 116(1.01-1.32) 1.47 (1.25-174) 0415 0.461 
HT 2q23 ©100.52-100.58 rs11110912 9.18x10°° 194x10°° 327 311 G G ~  1.33(1.18-151) 1.34 (0.96-186) 0.165 0.200 
HT 3q21  66.90-67.04 rs1937506 9.23 10° 453x10°° 325 285 G A 1.33(1.04-1.69) 1.60 (1.26-2.02) 0.289 0.248 
HT 5q26 94.60-94.67 rs2398162 7.85x10° 567x10°% 333 340 A G_ 0.97 (0.76-1.25) 1.31(1.03-167) 0.258 0.218 
RA p36 = 2.44-2.77 rs6684865 5.37 10° 314x10°° 347 297 G A 1.27 (1.02-1.56) 1.54(1.25-1.90) 0.338 0.294 
RA p31 80.16-80.36 rs11162922 1.80 x 10° : 411 380 A G_~ 1.27(0.41-4.01) 2.00(0.64-6.20) 0.072 0.048 
RA 4p15 —- 24.99-25.13 rs3816587.  7.65x 10°? 9.25x10°°% 050 264 C C_ 0.91 (0.80-1.04) 1.35(1.14-159) 0.406 0.434 
RA 6q23 138.00-138.06 +s6920220 4.99x10°% 158x10°° 349 317 A A 1.20(1.06-1.36) 1.72 (1.33-2.22) 0.223 0.263 
RA 7q32 130.80-130.84  rs11761231 174x10°°% 265x10°°% 392 342 C T 144(1.19-1.75) 1.64(1.35-1.99) 0.375 0.327 
RA 0p15 6.07-6.16 rs2104286 7.02xX10°° 252x10°° 337 257 T C  1.41(1.10-1.81) 1.68 (1.31-214) 0.286 0.244 
RA 3q12 19.845-19.855 r1s9550642  844x10°% 390x100 °° 335 302 A A 1.34(1.15-1.56) 2.23 (1.21-4.13) 0.084 0.112 
RA 21q22 41.430-41.465 rs2837960 3.45X10° 168x10°°% 005 270 G G_ 0.95 (0.83-1.08) 2.30 (1.64-3.23) 0171 0.188 
RA 22q13.-35.870-35.885__rs743777 7.92x10°°% 115x10°°% 3.29 352 G G_~ 1.09(0.97-1.24) 1.72(1.40-2.11) 0.292 0.336 
T1D qg42 = 221.92-222.17_+s2639703  846xX10°% 1.74x10°° 3.25 3.06 C C  1.15(1.02-1.30) 1.61(1.31-1.99) 0.276 0.318 
T1D 4q27 = 123.02-123.92_rs17388568 5.01 10°? 3.27x10°°% 442 389 A A 1.26(1.11-142) 1.58 (1.27-1.95) 0.260 0.307
T1D 5q14 86.20-86.50 rs2544677 8.23 X 10° + 443x10°°° 332 270 C G ~ 1.34(1.00-1.79) 1.65 (1.24-2.18) 0.242 0.204
T1D 5q31 132.64-132.67  rs17166496 6.06x10° 520x10°° -0.97 3.25 C G_ 0.77 (0.68-0.87) 1.09 (0.92-1.29) 0.391 0.386
T1D 0p15 6.07-6.18 rs2104286 7.96 x10°°° 432x10°° 331 288 T C  1.30(1.02-165) 1.57 (1.25-1.99) 0.286 0.245 
T1D 2p13.—-9.71-9.80 rs11052552 10210 7.24x10°° 222 380 G T 1.49 (1.28-1.73) 1.43 (1.21-169) 0.486 0.446 
T1D 8pll  12.76-12.91 rs254215 189x10°°% 116x10°° 391 352 G G 1.30(1.15-147) 1.62(1.17-2.24) 0.163 0.201 
T2D p31 66.04-66.36 rs4655595  268xX10 °° 133x10° 381 347 G G ~= 1.37(1.17-1.59) 2.33 (1.23-4.42) 0.080 0.108 
T2D =.2q24 160.90-161.17 rs6718526 2.40x10°° 116x10°° 386 3.35 C T  149(1.05-2.11) 1.86 (1.32-2.63) 0.209 0.171 
T2D 3p14 55.24-55.32 rs358806 477 x10°° 3.05x10°° -083 2.72 A A 0.86 (0.75-0.97) 1.78 (1.34-2.36) 0.198 0.204 
T2D 4q27 122.92-123.02 rs7659604 21x10°°% 942x10°°% 013 274 T T  1.35(1.19-1.54) 1.09(0.91-1.30) 0.380 0.403 
T2D 0q11  43.43-43.63 rs9326506 7.78x10 °° 299x10°° 327 292 C C  1.28(1.11-148) 1.46 (1.24-1.72) 0.492 0.538 
T2D 2q13  49.50-49.87 rs12304921 5.37x10°° 7.07x10 °° -0.09 268 G G_ 2.50 (1.53-4.09) 1.94 (1.20-3.15) 0145 0.159 
T2D 2q15 69.58-69.96 rs1495377, -131x10°° 652x10°°% 401 315 G G = 1.28(1.11-149) 1.51 (1.28-1.78) 0.497 0.547 
T2D 5q24 72.24-72.50 rs293029 772x*10°°% 440x10°° 330 242 G A 1.25(1.04-1.51) 1.50 (1.24-1.82) 0.377 0.332 
T2D 5q25  78.12-78.36 rs2903265 957x10°°% 498x10°° 324 253 G A 1.18 (0.93-149) 1.47 (1.17-186) 0.284 0.243 
Regions with at least one SNP with a P value greater than 5 × 10⁻⁷ and less than 1 × 10⁻⁵ for either the trend or the genotypic test. Columns as for Table 3. Cluster plots for each SNP have been
inspected visually. Positions are in NCBI build-35 coordinates. Genotypic P values were not calculated for SNPs with the lowest MAFs owing to low numbers of rare-allele homozygotes and sensitivity 
to genotype calling errors. 


recently detected signal®. Finally, strong association with a cluster of
SNPs around rs17234657 (P = 2.1 × 10⁻¹³) within a 1.2 Mb gene
desert on chromosome 5p13.1 recapitulates the finding of a recent
GWA study®’. 

The current study identifies four further new strong association 
signals in CD, located on chromosomes 3p21, 5q33, 10q24 and 18p11 
(Table 3; Fig. 5). Successful replication for all four loci is reported 
elsewhere”. 

The first of these includes several SNPs around IRGM (immunity- 
related guanosine triphosphatase; the human homologue of the 
mouse Irgm/Lrg47), the strongest signal being at rs1000113 (P =
5.1 × 10⁻⁸). IRGM encodes a GTP-binding protein which induces
autophagy and is involved in elimination of intracellular bacteria,
including Mycobacterium tuberculosis”. Reduced function and/or
activity of this gene would be expected to lead to persistence of
intracellular bacteria, consistent with existing models of CD pathogenesis”
and the recent ATG16L1 association™ (see above).

The second novel CD association is seen at rs9858542 (P =
7.7 × 10⁻⁷), a synonymous coding SNP within the BSN (bassoon)
gene on chromosome 3p21. BSN is thought to encode a scaffold
protein expressed in brain and involved in neurotransmitter
release; a more plausible regional candidate is MST1 (macrophage
stimulating 1), which encodes a protein influencing motile activity
and phagocytosis by resident peritoneal macrophages.

The third novel association involves a cluster of SNPs around
rs10883365 (P = 1.4 × 10⁻⁸) on chromosome 10q24.2. The most
credible candidate here is the NKX2-3 (NK2 transcription factor 
related, locus 3) gene, a member of the NKX family of homeodo- 
main-containing transcription factors. Targeted disruption of the 
murine homologue of NKX2-3 results in defective development of 
the intestine and secondary lymphoid organs®. Abnormal expression 
of NKX2-3 may alter gut migration of antigen-responsive lympho- 
cytes and influence the intestinal inflammatory response. 

The final novel association, at rs2542151 (P = 4.6 × 10⁻⁸), maps
5.5-kb upstream of PTPN2 (protein tyrosine phosphatase, non-receptor
type 2) on chromosome 18p11. PTPN2 encodes the T cell
protein tyrosine phosphatase TCPTP, a key negative regulator of
inflammatory responses. The same locus also shows strong association
with T1D susceptibility (trend test P = 1.9 × 10⁻⁶) and a
consistent, though weaker, association with RA (P = 1.9 × 10⁻²),
supporting the existence of overlapping pathways in the pathogenesis
of very distinct inflammatory phenotypes (combined trend test
P value for all three diseases = 9 × 10⁻⁸) (Table 3; ref. 10).

Several further loci generating less strong evidence for association
are of interest on the basis of their biological candidacy (Table 4). For
example, rs9469220 (P = 8.7 × 10⁻⁷), mapping to the human leukocyte
antigen (HLA) system class II region, was detected in the ‘second
tier’ of associations (Table 4). This suggests a significant contribution
of HLA to CD-susceptibility, though less marked than seen in classical
autoimmune conditions such as RA and T1D. Another interesting
candidate flagged in Table 4 is TNFAIP3 (TNFα induced protein 3),
the closest gene to rs7753394 on chromosome 6q23. The protein product
inhibits TNFα-induced NF-κB-dependent gene expression by
interfering with RIP- or TRAF-2-mediated transactivation signals,
hence interacting with the same pathway as CARD15 (NOD2).
Markers with lower levels of significance include rs6478108 (P=
9.0 X 10°) within TNFSF15 (tumour necrosis factor super family,
member 15), previously reported associated with CD”; and
rs3816769 (P = 3.1 X 10 >) which maps within STAT3 (signal transducers
and activator of transcription, member 3). On the X chromosome,
rs2807261 (P=1.3X 10’) maps 50-kb from the gene
CD40LG (CD40 ligand; previously known as TNF superfamily,
member 5), implicated in the regulation of B-cell proliferation, adhesion
and immunoglobulin class switching”’. As described in the section
on T1D, a modest association between CD and SNPs in the vicinity of
the PTPN11 gene on chromosome 12q24 (P= 1.5 X 107°) probably
reflects a locus influencing general autoimmune predisposition.

An emerging theme from molecular genetic studies of CD is the
importance of defects in autophagy and the processing of phagocytosed
bacteria. A number of other specific components within innate
and adaptive immune pathways are also highlighted.

Hypertension (HT). Hypertension refers to a clinically significant
increase in blood pressure and constitutes an important risk factor 
for cardiovascular disease (http://www.who.int/whr/2002/en/; ref. 
72). Lifestyle exposures that elevate blood pressure, including sodium 
intake, alcohol and excess weight’”’ are well-described risk factors. 
Genetic factors are also important’*”*. Estimates of λs are approximately
2.5-3.5.

Experimental models have highlighted a number of quantitative 
trait loci but these have yet to translate into insights into human 
hypertension’”®. Linkage studies are consistent with susceptibility 
genes of modest effect size’’ and well-replicated findings have yet 
to emerge from association approaches. 

None of the variants previously associated with HT showed evidence
for association in our study, although we note that some, such as the
promoter of the WNK1 (WNK lysine deficient protein kinase 1)
gene’*”’, are not well tagged by the Affymetrix chip.

For HT there were no SNPs with significance below 5 × 10⁻⁷
(Table 3) but the number and distribution of association signals in
the range 10⁻⁴ to 10⁻⁷ was similar to that of the other diseases studied
(Table 4 and Supplementary Table 7). There are several possible explanations.
First, HT may have fewer common risk alleles of larger effect
sizes than some of the other complex phenotypes. If so, then identifi- 
cation of susceptibility variants for HT is likely to be reliant on the 
synthesis of findings from multiple large-scale studies. Second, the pre- 
sent study may have failed to detect genuine common susceptibility 
variants of large effect size because they happened to be poorly tagged 
by the set of SNPs genotyped in the current study. If so, further rounds 
of genotyping using resources that offer increased density (or comple- 
mentary SNP sets), and/or improved analytical methods (for example, 
imputation-based) should facilitate their discovery. Third, study of 
HT may be more susceptible than other phenotypes to the diluting 
effects of misclassification bias due to the presence of hypertensive 
individuals within the control samples. If so, power can be improved 
in future studies by use of controls specifically screened to exclude 
individuals with elevated blood pressure. 
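
The diluting effect of control misclassification described above can be illustrated with a short calculation. The following is a minimal sketch and not part of the study's analysis: the allele frequencies and the fractions of undiagnosed hypertensive individuals among controls are hypothetical, the function names are invented for this example, and it assumes that misclassified controls carry the risk allele at the same frequency as cases.

```python
def allelic_odds_ratio(p_case, p_control):
    """Allelic odds ratio from risk-allele frequencies in cases and controls."""
    return (p_case / (1.0 - p_case)) / (p_control / (1.0 - p_control))


def observed_control_frequency(p_case, p_control, misclassified):
    """Risk-allele frequency in a control sample in which a fraction
    `misclassified` of individuals are really (undiagnosed) cases.

    Assumption: misclassified controls have the case allele frequency.
    """
    return (1.0 - misclassified) * p_control + misclassified * p_case


# Hypothetical frequencies, chosen only for illustration.
p_case, p_control = 0.35, 0.30
true_or = allelic_odds_ratio(p_case, p_control)
print(f"OR with perfectly screened controls: {true_or:.3f}")

for misclassified in (0.10, 0.25):
    p_obs = observed_control_frequency(p_case, p_control, misclassified)
    diluted_or = allelic_odds_ratio(p_case, p_obs)
    print(f"misclassified fraction {misclassified:.2f}: observed OR {diluted_or:.3f}")
```

Under these assumptions the observed odds ratio moves towards 1 as the misclassified fraction grows, which is why screened controls would be expected to recover power.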

The most strongly associated SNPs (Table 4) do not identify genes 
from physiological systems previously implicated by clinical or genetic
studies in hypertension. The strongest signal overall is with
rs2820037 on 1q43 (genotypic test, P = 7.7 × 10⁻⁷). The closest
genes are RYR2 (encoding the ryanodine receptor 2), mutations in 
which are associated with stress-induced polymorphic ventricular 
tachycardia and arrhythmogenic right ventricular dysplasia*®*’; 
CHRM3, encoding the cholinergic receptor muscarinic 3, a member 
of the G protein-coupled receptor family’; and ZP4, the product of 
which is zona pellucida glycoprotein 4*'. The strong association sig- 
nals on the X chromosome using an expanded reference group (see 
below and Supplementary Table 9) are of substantial interest but they 
do not identify known genes of obvious relevance to HT.

Rheumatoid arthritis (RA). Rheumatoid arthritis is a chronic
inflammatory disease characterized by destruction of the synovial 
joints resulting in severe disability, particularly in patients who 
remain refractory to available therapies*’. Susceptibility to, and 
severity of, RA are determined by both genetic and environmental 
factors, with λs estimates ranging from 5-10 (ref. 83).

An association between RA and alleles of the HLA-DRB1 locus has 
long been established**. Despite extensive linkage**”’ and association 
studies, only one other RA susceptibility locus has been convincingly 
identified in Caucasians. In common with several autoimmune dis- 
eases including T1D, carriage of the T allele of the rs2476601 SNP in 
the PTPN22 (protein tyrosine phosphatase, non-receptor type 22) 
gene has been reproducibly associated with RA, conferring a genetic 
relative risk of approximately 1.8 (refs 88, 89). These known associa- 
tions with HLA-DRB1 and PTPN22 explain around 50% of the famil- 
ial aggregation of RA. 

Both these previous associations emerge strongly here (Table 2). 
The most associated marker within PTPN22 (rs6679677: chromosome
1p13) is perfectly correlated (HapMap CEU data r² = 1) with
the functionally relevant SNP (rs2476601) described previously, and 
the effect size is consistent with previous estimates®. Amongst other 
putative RA susceptibility genes, two SNPs mapping to CTLA-4 (cyto- 
toxic T-lymphocyte associated 4) rs3087243 and rs11571300 were 
only nominally significant (P= 0.085 and P= 0.034, respectively) 
(Supplementary Table 10). 

RA was the sole disease for which the sex-differentiated analysis 
generated a strong signal due to different genetic effects in males and 
females. The SNP rs11761231 (chromosome 7) generates a P value of
3.9 × 10⁻⁷ for the 2-degrees of freedom (d.f.) sex-differentiated test
which combines trend tests in males and females (Table 3). (The
trend test ignoring the sex of the individuals has a P value of
1.7 × 10⁻⁶.) This genotype has no effect on disease status in males,
but a strong apparently additive effect in females (P value in a logistic
regression model with additive log-odds is 0.68 in males and
6.8 × 10⁻⁸ in females, additive OR for females 1.32), and may represent
one of the first sex-differentiated effects in human diseases.
Cluster plots for this SNP seem good, but it is surrounded by
recombination hotspots and has no other SNPs on the Affymetrix
chip with r² > 0.1 (Supplementary Fig. 11). Some caution is therefore
required, but this represents a potentially interesting finding which
warrants further investigation, particularly given the sex-related prevalence
difference characteristic of this condition.
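
One simple way to build a 2-d.f. test of this kind is to compute a Cochran-Armitage trend statistic separately in males and females and add the two 1-d.f. chi-squared statistics. The sketch below illustrates that idea with hypothetical genotype counts. It is an assumption about the general construction rather than a reproduction of the study's software, and all function names are invented for this example.

```python
import math


def trend_chi2(cases, controls):
    """Cochran-Armitage trend statistic (1 d.f.) from genotype counts.

    `cases` and `controls` are (n_AA, n_AB, n_BB) tuples; genotypes are
    scored 0, 1, 2 copies of allele B (additive coding).
    """
    scores = (0, 1, 2)
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    score_cases = sum(w * r for w, r in zip(scores, cases))
    score_all = sum(w * t for w, t in zip(scores, totals))
    score_sq_all = sum(w * w * t for w, t in zip(scores, totals))
    numerator = n * (n * score_cases - n_cases * score_all) ** 2
    denominator = n_cases * n_controls * (n * score_sq_all - score_all ** 2)
    return numerator / denominator


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable for 1 or 2 d.f."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only 1 or 2 degrees of freedom are supported here")


def sex_differentiated_test(male_cases, male_controls, female_cases, female_controls):
    """Sum of per-sex trend statistics, referred to chi-squared with 2 d.f."""
    chi2_m = trend_chi2(male_cases, male_controls)
    chi2_f = trend_chi2(female_cases, female_controls)
    total = chi2_m + chi2_f
    return chi2_m, chi2_f, total, chi2_sf(total, 2)


# Hypothetical counts: no effect in males, an allele-frequency shift in females.
chi2_m, chi2_f, total, p_2df = sex_differentiated_test(
    male_cases=(400, 450, 150), male_controls=(800, 900, 300),
    female_cases=(500, 700, 300), female_controls=(1200, 1300, 500),
)
print(f"males: chi2={chi2_m:.2f}, P={chi2_sf(chi2_m, 1):.3g}")
print(f"females: chi2={chi2_f:.2f}, P={chi2_sf(chi2_f, 1):.3g}")
print(f"combined 2-d.f. test: chi2={total:.2f}, P={p_2df:.3g}")
```

Because the two statistics are added rather than pooled, an effect confined to one sex is not cancelled by a null result in the other, at the cost of one extra degree of freedom.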

None of the 9 SNPs with nominal P values in the range 10⁻⁵ to
5 × 10⁻⁷ (Table 4) map to loci previously associated with RA. Of
particular interest is the association of SNPs mapping close to both
the alpha and beta chains of the IL2 receptor (rs2104286 in the case of
IL2RA; rs743777 in the case of IL2RB). The IL2 receptor mediates IL2 stimulation
of T lymphocytes and is thereby thought to have an important
role in preventing autoimmunity. A rare 4-base-pair deletion of 
IL2RA has been associated with development of severe autoimmune 
disease”’, and there is evidence (from previous data®', and from this 
study and its follow-up) that SNPs within the IL2RA gene region are 
associated with T1D (see also T1D section). 

Several of the SNPs with nominal significance in the range 10⁻⁴ to
10⁻⁵ (Supplementary Table 7) map to genes with plausible biological
relevance. Examples include SNPs within genes implicated in the 
TNF pathway (for example, rs2771369 in TNFAIP2 (tumour necrosis 
factor, alpha-induced protein 2)) or in the regulation of T-cell func- 
tion (rs854350 in GZMB (granzyme B) and rs4750316 in PRKCQ 
(protein kinase C, theta)). The association with rs10786617 in 
KAZALD1 (Kazal-type serine protease inhibitor domain-containing
protein 1 precursor), a gene whose product is known to have a role in 
bone regeneration after injury, may be relevant to the development of 
bone erosions in RA. 

RA and T1D were already known to have two disease susceptibility
genes in common: at the MHC, and at PTPN22. As detailed elsewhere,
our study provides data indicating that this list can be
extended to include variants around IL2RA (chromosome 10p15), 
PTPN2 (chromosome 18p11) and the chromosome 12q24 region 
(Supplementary Table 11), all apparently novel in RA.

Type 1 diabetes (T1D). Type 1 diabetes is a chronic autoimmune
disorder with onset usually in childhood”. The λs for T1D is ~15 and
twin data suggest that over 85% of the phenotypic variance is due to 
genetic factors”’. There are six genes/regions for which there is strong 
pre-existing statistical support for a role in T1D-susceptibility: these 
are the major histocompatibility complex (MHC), the genes encod- 
ing insulin, CTLA-4 (cytotoxic T-lymphocyte associated 4) and 
PTPN22 (protein tyrosine phosphatase, non-receptor type 22), and 
the regions around the interleukin 2 receptor alpha (IL2RA/CD25) 
and interferon-induced helicase 1 genes (IFIH1/MDA5)”*. However, 
these signals can explain only part of the familial aggregation of T1D. 
Five of these previously identified associations were detected in this 
scan (P=0.001) (Table 2 and Supplementary Table 10), the excep- 
tion being the INS gene discussed above. 

In this study, single-point analyses revealed three novel regions (on 
chromosomes 12q13, 12q24 and 16p13) showing strong evidence of 
association (P < 5 × 10⁻⁷; Table 3). Four further regions attained
similar levels of significance either through multilocus analyses 
(chromosomes 4q27 and 12p13: Table 3, Supplementary Fig. 12), 
or through the combined analysis of autoimmune cases (chromo- 
somes 18p11 and the 10p15 CD25 region: Table 3, Supplementary 
Fig. 13). The associations with T1D for chromosomes 12q13, 12q24, 
16p13 and 18p11 have been confirmed in independent and multiple 
populations’®. 

The two signals on chromosome 12 (at 12q13 and 12q24) map to 
regions of extensive linkage disequilibrium covering more than ten 
genes (Fig. 5). Several of these represent functional candidates 
because of their presumed roles in immune signalling, considered 
to be a major feature of T1D-susceptibility. These include ERBB3 
(receptor tyrosine-protein kinase erbB-3 precursor) at 12q13 and 
SH2B3/LNK (SH2B adaptor protein 3), TRAFD1 (TRAF-type zinc 
finger domain containing 1) and PTPN11 (protein tyrosine phos- 
phatase, non-receptor type 11) at 12q24. For these signal regions in 
particular, extensive resequencing, further genotyping and targeted 
functional studies will be essential steps in identifying which gene, or
genes, are causal”®. Of those listed, PTPN11 is a particularly attractive 
candidate given a major role in insulin and immune signalling”. It is 
also a member of the same family of regulatory phosphatases as 
PTPN22, already established as an important susceptibility gene for 
T1D and other autoimmune diseases’. Indeed, the 12q24 variant 
most associated with T1D also features in both the CD and RA scans, 
generating a combined signal for all autoimmune cases of 9.3 X 
10° (Supplementary Table 11). 

In contrast, available annotations suggest that the 16p13 region 
contains only two genes of unknown function, KIAA0350 and dexa- 
methasone-induced transcript (Fig. 5). Also, the region of association 
identified on 18p11 (Supplementary Fig. 14), which seems to confer 
susceptibility to all three autoimmune conditions studied (combined
trend test P = 9 × 10⁻⁸; P = 4.6 × 10⁻⁸ for CD, 1.9 × 10⁻² for RA,
and 1.9 × 10⁻⁶ for T1D: Supplementary Table 11), maps to a single
gene, PTPN2 (protein tyrosine phosphatase, non-receptor type 2), a 
member of the same family as PTPN22 and PTPN11 and involved in 
immune regulation”’. 

Our scan found associations with SNPs within the chromosome 
10p15 region containing CD25, encoding the high-affinity receptor 
for IL-2. This is consistent with a previous report of associations of 
this region with T1D”'. The CD25 region has previously been shown 
to be associated with Graves’ disease’ and the present study also 
provides evidence of association with RA (combined trend test 
P=5x10 %, P=~7X10° for RA and T1D separately,
Supplementary Table 11). This finding has clear biological connec- 
tions to the evidence of association between T1D and a region of 4q27 
revealed by the multilocus analysis (Supplementary Table 12, 
Supplementary Fig. 12). This region contains the genes encoding 
both IL-2 and IL-21. Together with studies in the NOD (nonobese 
diabetic) mouse model of T1D, which have shown that a major non- 
MHC locus (Idd3) reflects regulatory variation of the Il2 gene”, our
results point to the primary importance of the IL-2 pathway in T1D 
and other autoimmune diseases. 

One further region deserves comment. In the multilocus analysis,
there was increased support for a region on chromosome 12p13
containing several candidate genes, including CD69 (CD69 antigen 
(p60, early T-cell activation antigen)) and multiple CLEC (C-type 
lectin domain family) genes. In contrast to the chromosome 4 region 
where the effect of imputation is to tip an already-strong signal
(5.01 × 10⁻⁷ for typed rs17388568, trend test) over the arbitrary
threshold of 5 × 10⁻⁷, the 12p13 locus involves a more marked
change between imputed and actual (7.2 × 10⁻⁷ for rs11052552,
general test). Replication studies of this imputed SNP to date have
produced equivocal results (for details see ref. 10).

Type 2 diabetes (T2D). Type 2 diabetes is a chronic metabolic disorder
typically first diagnosed in the middle to late adult years’.
Strongly associated with obesity, the condition features defects in 
both the secretion and peripheral actions of insulin'®'. The appreciable
familial aggregation of T2D (an estimated λs of ~3.0 in
European individuals)” reflects both shared family environment 
and genetic predisposition. Heritability values vary widely with most 
estimates between 30 and 70%". 

To date, robust, widely replicated associations in non-isolate 
populations are limited to variants in three genes: PPARG (encoding 
the peroxisomal proliferative activated receptor gamma; P12A'”), 
KCNJ11 (the inwardly-rectifying Kir6.2 component of the pancreatic 
beta-cell KATP channel; E23K"’) and TCF7L2 (transcription factor 
7-like 2; rs7903146 (refs 104, 105)). 

All three of these signals are detected here with effect-sizes consistent
with previous reports (Table 2). A cluster of SNPs on chromosome
10q, within TCF7L2, represented by rs4506565 (trend test,
OR 1.36, P = 5.7 × 10⁻¹³) generates the strongest association signal
for T2D (Table 3, Fig. 5). Rs4506565 is in tight linkage disequilibrium
(r² of 0.92 in the CEU component of HapMap) with rs7903146, the
variant with the strongest aetiological claims'’’*’°°. In fact, our
imputation analysis confirms that rs7903146, though unrepresented
on the chip, is responsible for the strongest association effect in this 
region (Fig. 5). TCF7L2 acts within the WNT-signalling pathway, 
and effects on diabetes risk seem to be mediated predominantly 
through beta-cell dysfunction’. 
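
The r² values quoted for pairs of SNPs measure the squared correlation between alleles at the two sites. As a minimal sketch, assuming biallelic SNPs with known haplotype frequencies (the numbers below are hypothetical and are not HapMap estimates), r² can be computed as D² / (pA(1 − pA) pB(1 − pB)), where D = pAB − pA·pB. The function name is invented for this example.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared allelic correlation r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical haplotype frequencies for a typed SNP and a nearby untyped variant.
print(f"tight LD:     r2 = {ld_r_squared(0.29, 0.30, 0.31):.2f}")
print(f"no LD at all: r2 = {ld_r_squared(0.30 * 0.31, 0.30, 0.31):.2f}")
```

A typed SNP with r² close to 1 is an almost perfect proxy for the untyped variant, whereas a low r² means that an association at the untyped variant could be largely invisible on the chip.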

As expected, given existing effect-size estimates, the signals associated
with variants within the other established T2D-susceptibility
genes, KCNJ11 (rs5215, r² of 0.9 with rs5219, E23K) and PPARG
(rs17036328, r² of 1 with rs1801282, P12A) are less dramatic (trend
test, OR 1.15 and 1.23 respectively, both P=~0.001). These examples
illustrate how genuine disease-susceptibility variants can generate 
association signals which would not attract immediate attention 
for follow-up in the genomewide context. 

Apart from TCF7L2, the scan reveals two signals for T2D with P 
values less than 5 × 10⁻⁷ (Table 3, Fig. 5). The first of these maps
within the FTO (fat-mass and obesity-associated) gene on chromosome
16q. Several adjacent SNPs (including rs9939609,
rs7193144 and rs8050136) generate signals characterized by a per- 
allele OR for T2D of ~1.25 and a risk-allele frequency of ~40% in 
controls. As recently described in follow-up studies prompted by this 
finding, the effect of these variants on T2D-risk has been replicated 
and is mediated entirely by their marked effect on adiposity™*. 
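
To see what a per-allele OR of ~1.25 at a risk-allele frequency of ~40% in controls implies, the sketch below converts these two quantities into an expected risk-allele frequency in cases. It treats the per-allele OR as an allelic odds ratio and assumes a multiplicative (log-additive) model, both of which are simplifications made here for illustration; the function names are hypothetical.

```python
def case_allele_frequency(control_freq, per_allele_or):
    """Expected risk-allele frequency in cases, given the control frequency
    and an allelic odds ratio."""
    odds_control = control_freq / (1.0 - control_freq)
    odds_case = per_allele_or * odds_control
    return odds_case / (1.0 + odds_case)


def genotype_odds_ratios(per_allele_or):
    """Heterozygote and homozygote ORs under a multiplicative model."""
    return per_allele_or, per_allele_or ** 2


# Values taken from the text: per-allele OR ~1.25, control frequency ~40%.
freq_cases = case_allele_frequency(0.40, 1.25)
het_or, hom_or = genotype_odds_ratios(1.25)
print(f"expected risk-allele frequency in cases: {freq_cases:.3f}")
print(f"heterozygote OR: {het_or:.2f}, homozygote OR: {hom_or:.2f}")
```

The resulting case-control difference in allele frequency is only a few percentage points, which illustrates why large samples are needed to detect effects of this size.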

The third association signal (chromosome 6p22) features a cluster 
of highly associated SNPs (including rs9465871) with risk-allele fre- 
quencies between 18 and 35%, mapping to intron 5 of the CDKAL1 
(CDK5 regulatory subunit associated protein 1-like 1) gene. 
Although the function of CDKAL1 is not known, it shares homology
at the protein domain level with CDK5 regulatory subunit associated 
protein 1 (CDK5RAP1). CDK5RAP1 is known to inhibit the activa- 
tion of CDK5, a cyclin-dependent kinase which has been implicated 
in the maintenance of normal beta-cell function'®*. Our own follow- 
up studies, and scans by other groups have shown strong replication 
of this finding’? **. The effect of this variant on T2D-risk shows 
significant departures from additivity (Supplementary Table 8). 

One notable inclusion amongst the variants with more modest 
association signals is a cluster of SNPs on chromosome 10 including 
rs10748582 and rs7923866, which generate trend test P values 
between 10⁻⁴ and 10⁻⁵. This cluster maps in the vicinity of the 
HHEX (homeobox, hematopoietically expressed) and IDE (insulin- 
degrading enzyme) genes, in a region recently highlighted in a GWA 
scan for T2D performed in 1363 subjects of French origin'®’. The 
SNPs showing association in our data are proxies for those reported 
in the French study and generate similar effect-size estimates for T2D. 

Of the three other regions highlighted by the French scan’, none 
can be confirmed by our data. The SNP in SLC30A8 associated with 
T2D in the French report (rs13266634) is poorly correlated with 
SNPs on the Affymetrix chip (r² < 0.01), and extensive recombina- 
tion events in the region limit the value of data-imputation methods. 
Coverage of the LOC387761 and EXT2 signals is considerably better, 
but, for these, neither genotyped nor imputed SNPs show evidence 
for association with T2D. 

WTCCC data contributed to identification of two additional 
robustly replicating T2D signals, mapping to the IGF2BP2 gene 
and CDKN2A/CDKN2B regions’”*'”*, although neither generated 
impressive P values on the primary scan analysis (neither single-point 
P was <10⁻⁴). The latter signal maps to the same region as the CAD 
signal on chromosome 9 though different SNPs are involved. The 
other SNPs in Table 4 do not map to genes or regions previously 
implicated in T2D pathogenesis, and replication efforts to date have 
not identified any confirmed signals”. 

Expanded reference group analyses. For a fixed number of cases, 
power of a case-control study can be increased by enlarging the 
reference group. Our main analyses used a control:case ratio of 
1.5:1 for each disease. The availability of the other 6 disease data sets 
gave us the opportunity to expand the reference group up to a ratio of 
~7.5:1, with potential reciprocal benefits for the analysis of each 
disease. For BD and T2D the expanded reference group comprised 
the 58C and UKBS controls supplemented by the other 6 disease sets; 
for CAD and HT this expanded reference group was reduced to exclude 
HT and CAD respectively; for CD, RA and T1D, the reference group 
was augmented only by the cases from the non-autoimmune diseases. 
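
The gain from a larger reference group can be illustrated with a short calculation. This is a minimal sketch, not the Consortium's power analysis: it assumes that the variance of a case-control frequency difference is proportional to (1/n)(1 + 1/r) for n cases and r controls per case, which holds approximately when allele frequencies in the two groups are similar. The function names are illustrative.

```python
def variance_factor(controls_per_case: float) -> float:
    """Relative variance of a case-control frequency difference.

    For n cases and r*n controls the variance is proportional to
    (1/n) * (1 + 1/r); this returns the (1 + 1/r) term, which tends
    to 1 as the reference group grows.
    """
    if controls_per_case <= 0:
        raise ValueError("controls_per_case must be positive")
    return 1.0 + 1.0 / controls_per_case


def efficiency_gain(ratio_before: float, ratio_after: float) -> float:
    """Factor by which information per case increases with the larger reference group."""
    return variance_factor(ratio_before) / variance_factor(ratio_after)


if __name__ == "__main__":
    for r in (1.5, 7.5):
        print(f"controls per case = {r}: variance factor = {variance_factor(r):.3f}")
    print(f"gain from 1.5:1 to 7.5:1 = {efficiency_gain(1.5, 7.5):.2f}x")
```

Under these assumptions, most of the attainable benefit is already captured at modest ratios, because the variance factor can never fall below 1 however many controls are added.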

The utility of the expanded reference group approach was demon- 
strated by increased evidence for association at most of the loci that 
received strongest support from our primary analysis, including 
many of the signals at loci known to show robust association in 
T1D, T2D and CD (Supplementary Table 9). Additionally, this ana- 
lysis elevated several loci with modest levels of statistical significance 
in the primary analysis, to the top tier of statistical significance 
(P < 5 × 10⁻⁷). 

Our data indicate that this approach may be a useful adjunct to 
conventional analysis and that loci identified as highly significant 
should be considered for follow up. There are two important caveats. 
First, susceptibility genes that influence both the test disease and one 
or more of the diseases included in the reference group will cause 
loss of power. Second, a ‘mirror-image’ effect could occur whereby 
a strong association within the expanded reference sample (for 
example, HLA in autoimmune diseases) causes spurious association 
with the opposite allele in the test disease. Thus, a positive association 
using an expanded reference group must be interpreted within the 
context of association findings in the diseases included within the 
reference group. 
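
The 'mirror-image' caveat can be made concrete with a small worked example. The sketch below uses entirely hypothetical allele frequencies and group sizes (none are taken from the study): an allele that is unrelated to the test disease appears protective once the reference group is enriched for that allele by cases of other diseases.

```python
def odds(p: float) -> float:
    """Convert an allele frequency to odds."""
    return p / (1.0 - p)


def pooled_frequency(groups):
    """Sample-size-weighted allele frequency over (frequency, n) pairs."""
    total = sum(n for _, n in groups)
    return sum(freq * n for freq, n in groups) / total


def allelic_odds_ratio(case_freq: float, reference_freq: float) -> float:
    """Allelic odds ratio of the test cases against a reference group."""
    return odds(case_freq) / odds(reference_freq)


if __name__ == "__main__":
    population_freq = 0.20        # hypothetical frequency in true controls
    other_disease_freq = 0.35     # hypothetical: allele enriched in reference diseases
    test_case_freq = 0.20         # allele is NOT associated with the test disease

    expanded = [(population_freq, 3000), (other_disease_freq, 6000)]
    expanded_freq = pooled_frequency(expanded)

    print("OR against controls only:      ",
          round(allelic_odds_ratio(test_case_freq, population_freq), 3))
    print("OR against expanded reference: ",
          round(allelic_odds_ratio(test_case_freq, expanded_freq), 3))
```

The second odds ratio falls below 1 even though the allele has no effect on the test disease, which is why a positive result must be read alongside the findings for the diseases in the reference group.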

Disease models. It is of interest to consider which statistical models 
best describe the data at and between loci that are strongly associated 
with disease status. Biological interpretation of these statistical mod- 
els is not straightforward but they can help in choosing more power- 
ful statistical tools for detecting associations. 

First, consider separately each of the 19 non-MHC SNPs showing 
strong evidence for association on either the trend or genotypic test 
in Table 3. For four of these 19, the P value on the 2-d.f. genotypic test 
was smaller than that on the 1-d.f. trend test (Table 3). When com- 
paring disease models, these were also the four SNPs with evidence 
for departure from a simple model in which odds of disease increase 
multiplicatively with the number of copies of the risk allele (Sup- 
plementary Table 8). This supports our view that the genotypic test 
should be carried out in addition to the trend test, although it should 
perhaps be viewed more cautiously for two reasons: it is more sus- 
ceptible to genotyping errors; and (on the basis of our findings) 
experience does not favour strong dominance effects. 
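
To make the comparison between the two tests concrete, the following is a minimal sketch of a 1-d.f. Cochran-Armitage trend test and a 2-d.f. Pearson genotypic test on a 2 × 3 table of genotype counts. It is an illustration using textbook formulas, not the analysis code used in the study; the additive scores (0, 1, 2) and the example counts are assumptions.

```python
import math


def genotypic_chi2(cases, controls):
    """Pearson chi-square (2 d.f.) on the 2 x 3 table of genotype counts."""
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    chi2 = 0.0
    for r, s in zip(cases, controls):
        column = r + s
        if column == 0:
            continue  # empty genotype class contributes nothing
        for observed, row_total in ((r, n_cases), (s, n_controls)):
            expected = row_total * column / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def trend_chi2(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend chi-square (1 d.f.) with allele-count scores."""
    n_cases = sum(cases)
    columns = [r + s for r, s in zip(cases, controls)]
    total = sum(columns)
    s_cases = sum(w * r for w, r in zip(scores, cases))
    s_all = sum(w * n for w, n in zip(scores, columns))
    s2_all = sum(w * w * n for w, n in zip(scores, columns))
    numerator = total * (total * s_cases - n_cases * s_all) ** 2
    denominator = n_cases * (total - n_cases) * (total * s2_all - s_all ** 2)
    return numerator / denominator if denominator else 0.0


def chi2_p_value(chi2, df):
    """Upper-tail P value for a chi-square statistic with 1 or 2 d.f."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    if df == 2:
        return math.exp(-chi2 / 2.0)
    raise ValueError("only 1 or 2 degrees of freedom are supported")


if __name__ == "__main__":
    # Hypothetical counts of genotypes carrying 0, 1 and 2 risk alleles.
    cases, controls = (30, 50, 20), (50, 40, 10)
    print("trend P     =", chi2_p_value(trend_chi2(cases, controls), 1))
    print("genotypic P =", chi2_p_value(genotypic_chi2(cases, controls), 2))
```

With this form of the trend statistic, the trend chi-square never exceeds the genotypic chi-square. The genotypic test therefore gives the smaller P value only when enough of the signal departs from the linear trend to outweigh the extra degree of freedom.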

A separate question relates to the best models for the way in which 
different loci combine to affect susceptibility to a disease, and as a 
consequence on the extent to which methods explicitly allowing inter- 
actions between loci should be employed to detect associations''®. 
None of the analyses reported here includes such interactions, so we 
are not well placed to address the general question. Nonetheless, 
within each collection with multiple associated regions (CD, T1D 
and T2D) we considered all pairs of non-MHC SNPs in Table 3 and 
looked for a departure from the model in which the two loci combine 
to increase log-odds in an additive fashion. We found suggestive evid- 
ence of a departure from multilocus additivity between rs1000113 
and rs10761659 in CD (unadjusted P value = 0.002) and between 
rs9465871 and rs4506565 in T2D (unadjusted P value = 0.004). 
Further investigation of this question, preferably on unbiased sets 
of disease loci found through the application of single locus and 
interaction-based approaches, would seem warranted. 


Discussion 
We have studied seven common familial diseases by genome-wide 
association analysis in 16,179 individuals. Our findings inform 
understanding of the genetic basis of the diseases concerned and 
provide methodological insights relevant to the pursuit of GWA 
studies in general. 

A simple but important observation is that GWA analysis provides 
a highly effective approach for exploring the genetic underpinnings 
of common familial diseases. Our yield of novel, highly significant 
association findings is comparable to, or exceeds, the number of 
those hitherto generated by candidate gene or positional cloning 
efforts. For many of the compelling signals, replication has already 
been obtained, including regions on chromosomes 3p21, 5q33, 
10q24 and 18p11 for CD”, 12q13, 12q24, 16p13 and 18p11 in 
T1D” and 6p22 and 16q12 in T2D'***. For others, replication is 
required to establish a definitive relationship with disease. Additional 
findings of particular interest include the identification of several loci 
that seem to influence susceptibility to multiple autoimmune dis- 
eases, and the suggestion of a novel locus for RA which shows sex- 
specific effects. 

Our study enables us to make several general recommendations 
relevant to GWA studies. The first relates to the importance of careful 
quality control. In such large data sets, small systematic differences 
can readily produce effects capable of obscuring the true associations 
being sought'''''?. We implemented extensive quality control checks 
to minimize differences in sample DNA concentration, quality and 
handling procedures and combined a new genotype-calling algo- 
rithm (CHIAMO) with a set of filtering heuristics to select SNPs 
for further analysis. Given that infallible detection of incorrect geno- 
type calls is not yet possible, the criteria used for SNP exclusion need 
to strike a compromise between stringency (which may discard true 
signals or generate spurious positives through differential missing- 
ness) and leniency (with the danger that true signals are swamped by 
spurious findings due to poor genotype calling). As such, systematic 
visual inspection of cluster plots for SNPs of interest remains an 
integral part of the quality control process. 

The potential for population structure to undermine inferences in 
case-control association studies has long been debated'”’ but limited 
empirical data have been available to assess the issue. Our study 
highlighted several loci, some known and some new, which dem- 
onstrate substantial geographical variation in allele frequencies 
across Britain (Table 1), most probably due to natural selection in 
ancestral populations. Outside these loci, the effects of population 
structure are relatively minor, and do not represent a major source of 
confounding, provided that individuals with appreciable non- 
European ancestry are excluded. Although these conclusions may 
not generalize to studies in other locations, this finding reinforces 
the logistical and economic benefits of the case-control design over 
alternatives (such as family-based association studies). 

Our study allowed us to address another important methodo- 
logical issue: the adequacy, or otherwise, of using a common set of 
controls, rather than a sample recruited explicitly for use with a 
defined disease sample. It is often assumed that failure to match cases 
and controls for socio-demographic variables will lead to substantial 
inflation of the type I error rate. Our study demonstrates that, within 
the context of large-scale genetic association studies, for British 
populations at least, this concern has been overstated. A related argu- 
ment against use of population controls relates to the perceived 
impact of misclassification bias when a proportion of controls meet 
the criteria used to define cases. However, the consequent loss of 
power is modest unless the trait of interest is very common’. Given 
the above, the present study provides a compelling case for both the 
suitability and efficiency of the common control design in Britain and 
warrants its serious consideration elsewhere. Further benefits can be 
expected from use of this common control genotype data set in future 
GWA studies in Britain. Finally, in failing to detect significant differ- 
ences in performance between the epidemiological sample (58C) and 
that derived from blood donors (UKBS), we validate the use of the 
latter samples for cost-effective, large-scale control DNA provision. 

In terms of general biological insights, the most profound relate to 
inferences about the allelic architecture of common traits. The novel 
variants we have uncovered are characterized by modest effect size 
(that is, per-allele ORs between 1.2 and 1.5) and even these estimates 
are likely to be inflated''*. We identified no additional common 
variants of very large effect (akin to HLA in T1D: Supplementary 
Fig. 15). The observed distribution of effect sizes is consistent with 
models based on theoretical considerations and empirical data from 
animal models*”'!*"'* that suggest that, for any given trait, there will 
be few (if any) large effects, a handful of modest effects and a sub- 
stantial number of genes generating small or very small increases in 
disease risk. 

There are several important corollaries. Notwithstanding the 
incomplete coverage afforded by the genotyping reagents employed, 
most of the susceptibility effects yet to be uncovered for these diseases 
(at least those attributable to, or tagged by, common SNPs) are likely 
to have effects of similar or smaller magnitude to those we have 
highlighted. Beyond the signals with the strongest evidence for asso- 
ciation, most of which are likely to be real (and many of which have 
already been confirmed), there will be many additional susceptibility 
variants for which the WTCCC provides some evidence, but for 
which extensive replication will be required to establish validity. 
PPARG and KCNJ11 provide examples of proven susceptibility genes 
(for T2D) that generated only modest evidence for association within 
the WTCCC, and which would only have been revealed by such 
replication efforts. Given the likely preponderance of susceptibility 
variants of small effect, the potential for identifying further loci is 
limited only by the clinical resources available for replication (assum- 
ing suitable study design, accurate genotyping and appropriate ana- 
lysis and inference). Provided the attribution of a causal relationship 
with the trait of interest is robust, even variants of very small effect 
can offer fundamental biological insights. 

The patterns of allelic architecture uncovered mean that replica- 
tion efforts will need to feature comparably large sample sizes: even if 
one accepts more relaxed significance thresholds given the prior 
evidence, one has to consider the inflation in effect-size estimates 
in the primary study. Caution is required in reaching negative con- 
clusions on the basis of a single failed attempt at replication, or any set 
of replication attempts that are inadequately powered. 

Figure 6 | Strong associations in subsamples of our data. For the 16 SNPs 
in Table 3 (outside the MHC) with P values for the trend test below 5 × 10⁻⁷, 
we randomly generated 1,000 subsets of our full data set corresponding to 
case-control studies with different numbers of cases, and the same number 
of controls (x axis). The y axis gives the proportion of subsamples of a given 
size in which that SNP achieved a P value for the trend test below 5 × 10⁻⁷. 
SNPs are numbered according to the row in which they occur in Table 3 (so 
that, for example, the CAD hit is numbered 2, and the TCF7L2 hit on 
chromosome 10 for T2D is numbered 20). 

One of our major design considerations was sample size. We set 
out to include samples larger than those previously examined for 
genome-wide association, and our results suggest that such large 
sample sizes were necessary. Even with 2,000 cases and 3,000 controls, 
adequate power is restricted to common variants of relatively large 
effect (see Supplementary Table 2). We carried out an experiment to 
see which SNPs showing strong evidence of association in the full 
data (that is, signals outside MHC with trend test P < 5 × 10⁻⁷), 
would have been detected at that same threshold in only a subset 
of our data (Fig. 6). Because it focuses on a particular but arbitrary 
P-value threshold, some care is needed in interpreting the figure. 
Nonetheless, for subsamples of 1,000 cases and 1,000 controls, of 
the 16 loci detected in the full study, we would have been certain of 
seeing only 2, with an expectation of about 6; for subsamples of 1,500 
cases and 1,500 controls, we could expect to have seen about 9. These 
figures provide stark evidence that the larger the study sample, the 
more loci can be expected to reach threshold significance values. 
Indeed, given the likely distribution of effect sizes for most complex 
traits (see above), there are strong grounds for the prosecution of 
GWA studies on an even larger scale than ours, and, wherever pos- 
sible, combining the results from existing GWA scans performed for 
the same trait. To assist such efforts, individual level data from this 
study will be widely available through the Consortium’s Data Access 
Committee (follow links from http://www.wtccc.org.uk). 
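
The dependence of detection on sample size can be explored with a small simulation. The sketch below is a stand-in for, not a reproduction of, the subsampling experiment described above: instead of resampling the real genotypes it simulates genotypes under Hardy-Weinberg proportions and a per-allele (multiplicative) odds ratio, and it treats that odds ratio as a relative risk. The allele frequency, odds ratio and function names are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

THRESHOLD = 5e-7  # trend-test P-value threshold used in the text


def trend_p_value(cases, controls, scores=(0, 1, 2)):
    """P value of the 1-d.f. Cochran-Armitage trend test on genotype counts."""
    n_cases = sum(cases)
    columns = [r + s for r, s in zip(cases, controls)]
    total = sum(columns)
    s_cases = sum(w * r for w, r in zip(scores, cases))
    s_all = sum(w * n for w, n in zip(scores, columns))
    s2_all = sum(w * w * n for w, n in zip(scores, columns))
    denominator = n_cases * (total - n_cases) * (total * s2_all - s_all ** 2)
    if denominator == 0:
        return 1.0
    chi2 = total * (total * s_cases - n_cases * s_all) ** 2 / denominator
    return math.erfc(math.sqrt(chi2 / 2.0))


def genotype_counts(rng, allele_freq, n):
    """Draw n genotypes under Hardy-Weinberg; return counts of 0/1/2 risk alleles."""
    q = 1.0 - allele_freq
    weights = (q * q, 2.0 * allele_freq * q, allele_freq * allele_freq)
    drawn = Counter(rng.choices((0, 1, 2), weights=weights, k=n))
    return tuple(drawn[g] for g in (0, 1, 2))


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases under a per-allele multiplicative model."""
    return odds_ratio * control_freq / (1.0 - control_freq + odds_ratio * control_freq)


def detection_rate(control_freq, odds_ratio, n_per_group, n_reps=200, seed=0):
    """Proportion of simulated studies whose trend-test P value is below THRESHOLD."""
    rng = random.Random(seed)
    p_case = case_allele_freq(control_freq, odds_ratio)
    hits = 0
    for _ in range(n_reps):
        cases = genotype_counts(rng, p_case, n_per_group)
        controls = genotype_counts(rng, control_freq, n_per_group)
        if trend_p_value(cases, controls) < THRESHOLD:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    # Hypothetical variant: risk-allele frequency 0.3, per-allele OR 1.3.
    for n in (1000, 1500, 2000):
        print(n, "cases and controls:", detection_rate(0.3, 1.3, n))
```

Because the threshold is fixed and arbitrary, the printed proportions are best read as a ranking of designs rather than as formal power estimates.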

In our study, T1D and CD, the conditions showing strongest 
familial aggregation (as quantified by their sibling relative risks, 
λs), generated the largest number of highly significant associations. 
This relationship was not sustained in comparisons between the 
other five diseases. It is important to recognize that the association 
signals so far identified account for only a small proportion of overall 
familiality. There is a disparity in scale between the modest locus- 
specific λs effects attributable to the identified associations (for 
instance, the prominent TCF7L2 signal for T2D translates into a λs 
of only 1.03) and the estimates of overall familiality that reflect the 
combined effects of all genes and shared family environment. These 
estimates demonstrate the limited potential of the variants thus far 
identified (singly or in combination) to provide clinically useful 
prediction of disease'!”"’. 
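
The scale of a locus-specific sibling relative risk can be sketched with the standard single-locus result for a multiplicative model, λs = (1 + w/2)², where w = pq(γ − 1)²/(pγ + q)², p is the risk-allele frequency, q = 1 − p and γ is the per-allele relative risk. The calculation below is an illustration only: the input values are hypothetical, the odds ratio is treated as a relative risk, and the result is not claimed to reproduce the figure quoted in the text.

```python
def locus_sibling_relative_risk(risk_allele_freq: float, per_allele_rr: float) -> float:
    """Locus-specific sibling relative risk under a multiplicative model.

    Genotype relative risks are 1, g and g**2 for 0, 1 and 2 risk alleles.
    """
    p = risk_allele_freq
    q = 1.0 - p
    g = per_allele_rr
    w = p * q * (g - 1.0) ** 2 / (p * g + q) ** 2
    return (1.0 + w / 2.0) ** 2


if __name__ == "__main__":
    # Hypothetical common variant: risk-allele frequency 0.3, per-allele RR 1.37.
    print(round(locus_sibling_relative_risk(0.3, 1.37), 4))
```

Even a per-allele risk that is large by the standards of these scans yields a λs only slightly above 1, which illustrates the gap between single-locus effects and overall familial aggregation.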

The identification and characterization of the aetiological variants 
that underlie replicated associations will necessitate extensive fine- 
mapping and functional validation. We view the WTCCC study and 
data set as an important first step towards harnessing the powerful 
molecular genomic tools now available to dissect the biological basis 
of common disease and translating those findings into improvements 
in human health. 


METHODS SUMMARY 


A detailed description of materials and methods is given in Methods. The work- 
flow and organization of the project are given in Supplementary Fig. 16. Case 
series came from previously established collections with nationally represent- 
ative recruitment: 2,000 samples were genotyped for each. The control samples 
came from two sources: half from the 1958 Birth Cohort and the remainder from 
a new UK Blood Service sample. The latter collection was established specifically 
for this study and is a UK national repository of anonymized DNA samples from 
3,622 consenting blood donors. The vast majority of subjects were self-reported 
as of European Caucasian ancestry. All DNA samples were requantified and 
tested for degradation and PCR amplification. Genotyping was performed using 
GeneChip 500K arrays at the Affymetrix Services Lab (California): arrays not 
passing the 93% call rate threshold at P = 0.33 with the Dynamic Model algo- 
rithm were repeated. CEL (cell intensity) files were transferred to WTCCC for 
quantile normalization, and genotypes called using a new genotyping algorithm, 
CHIAMO, developed for this project. QC/QA measures included sample call 
rate, overall heterozygosity and evidence of non-European ancestry (809 samples 
excluded; 16,179 retained for analysis). SNPs were excluded from analysis 
because of missing data rates, departures from Hardy-Weinberg equilibrium 
and other metrics (31,011 excluded; 469,557 retained). Standard 1-d.f. and 2-d.f. 
tests of case-control association were supplemented with bayesian approaches, 
multilocus methods (data imputation) and analyses with combined data sets, 
either as additional cases (to detect variants influencing multiple phenotypes) 
or as an expanded reference group (to increase power). Results for each SNP for 
all analyses reported will be available from http://www.wtccc.org.uk, as will 
details allowing other researchers to apply for access to WTCCC genotype data. 
Software packages developed within the WTCCC are available on request (see 
Methods for details). 
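
The SNP exclusion step summarized above can be sketched as a simple per-SNP filter on missingness and Hardy-Weinberg equilibrium. This is a hedged illustration, not the project's pipeline: it uses a chi-square approximation to the Hardy-Weinberg test, and the thresholds `max_missing` and `min_hwe_p` are placeholder assumptions rather than the values applied in the study.

```python
import math


def hwe_p_value(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 d.f.) P value for departure from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return math.erfc(math.sqrt(chi2 / 2.0))


def keep_snp(n_aa, n_ab, n_bb, n_samples, max_missing=0.05, min_hwe_p=1e-6):
    """Return True if the SNP passes both illustrative filters.

    max_missing and min_hwe_p are hypothetical thresholds for this sketch.
    """
    called = n_aa + n_ab + n_bb
    missing_rate = 1.0 - called / n_samples
    return missing_rate <= max_missing and hwe_p_value(n_aa, n_ab, n_bb) >= min_hwe_p


if __name__ == "__main__":
    print(keep_snp(250, 500, 250, 1000))  # in equilibrium, fully called
    print(keep_snp(500, 0, 500, 1000))    # no heterozygotes at all
    print(keep_snp(250, 500, 150, 1000))  # 10% of genotypes missing
```

Filters of this kind are usually applied to control genotypes for the equilibrium test, since a true association can itself distort genotype proportions in cases.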


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

BD phenotype description. BD cases were all over the age of 16 yr, living in 
mainland UK and of European descent. Recruitment was undertaken through- 
out the UK by teams based in Aberdeen (8% of cases), Birmingham (35% cases), 
Cardiff (33% cases), London (15% cases) and Newcastle (9% cases). Individuals 
who had been in contact with mental health services were recruited if they 
suffered with a major mood disorder in which clinically significant episodes of 
elevated mood had occurred. This was defined as a lifetime diagnosis of a bipolar 
mood disorder according to Research Diagnostic Criteria!’ and included the 
bipolar subtypes that have been shown in family studies to co-aggregate for 
example”: bipolar I disorder (71% cases), schizoaffective disorder bipolar type 
(15% cases), bipolar II disorder (9% cases) and manic disorder (5% cases). After 
providing written informed consent, all subjects were interviewed by a trained 
psychologist or psychiatrist using a semi-structured lifetime diagnostic psychi- 
atric interview (in most cases the Schedules for Clinical Assessment in 
Neuropsychiatry'”° and available psychiatric medical records were reviewed). 
Using all available data, best-estimate ratings were made for a set of key pheno- 
typic measures on the basis of the OPCRIT checklist (which covers both psycho- 
pathology and course of illness)'*"!** and lifetime psychiatric diagnoses were 
assigned according to the Research Diagnostic Criteria'’’. The reliability of these 
methods has been shown to be high''®!”*'*4. Further details of clinical methodo- 
logy can be found in Green, 2005 (ref. 123) and Green, 2006 (ref. 124). 

CAD phenotype description. CAD cases had a validated history of either myo- 
cardial infarction or coronary revascularization (coronary artery bypass surgery 
or percutaneous coronary angioplasty) before their 66th birthday. Verification 
of the history of CAD was required either from hospital records or the primary 
care physician. Recruitment was carried out on a national basis in the UK 
through a direct approach to the public via (1) the media and (2) mailing all 
general practices (family physicians) with information about the study, as prev- 
iously described’*’. In an initial pilot phase, potential participants were also 
identified and approached through local CAD databases in the two lead centres 
(Leeds and Leicester). Although the majority of subjects had at least one further 
sib also affected with premature CAD, only one subject from each family was 
included in the present study. 

CD phenotype description. CD cases were attendees at inflammatory bowel 
disease clinics in and around the five centres which contributed samples to the 
WTCCC (Cambridge, Oxford, London, Newcastle, Edinburgh). Ascertainment 
was based on a confirmed diagnosis of Crohn’s disease (CD) using conventional 
endoscopic, radiological and histopathological criteria'’°. We included all sub- 
types of CD as classified by disease extent and behaviour and the collection was 
not specifically enriched for family history or early age of onset. The median age 
of diagnosis was 26.1 yr and 62% of the collection had undergone CD-related 
abdominal surgery. A small proportion had previously been recruited as mem- 
bers of multiply affected families but only one affected individual was included 
per family. 

HT phenotype description. HT cases comprised severely hypertensive probands 
ascertained from families with multiplex affected sibships or as parent—offspring 
trios. They were of white British ancestry (up to level of grand-parents) and were 
recruited from the Medical Research Council General Practice Framework and 
other primary care practices in the UK”. Each case had a history of hypertension 
diagnosed before 60 yr of age, with confirmed blood pressure recordings corres- 
ponding to seated levels >150/100 mm Hg (if based on one reading), or the 
mean of 3 readings greater than 145/95 mm Hg. These criteria correspond to 
the threshold for the uppermost 5% of blood pressure distribution in a contem- 
poraneous health screening survey of 5,000 British men and women in 1995 (N. 
Wald and M. Law, personal communication). We excluded hypertensive indi- 
viduals who self-reportedly consumed >21 units of alcohol per week and those 
with diabetes, intrinsic renal disease, a history of secondary hypertension or co- 
existing illness. Cases did not undergo systematic genetic screening to exclude 
the (rare) known monogenic causes of HT. We focused on the recruitment of 
hypertensive individuals with body mass indices <30 kg m⁻². The probands 
were extensively phenotyped by trained nurses (see http://www.brightstudy.ac.uk 
for standard operating procedures, additional phenotypes and study ques- 
tionnaires). Sample selection for WTCCC was based on DNA availability and 
quality. 

RA phenotype description. RA cases were recruited to studies coordinated by 
the ARC (Arthritis Research Campaign) Epidemiology Unit. All subjects were 
Caucasian over the age of 18 yr and satisfied the 1987 American College of 
Rheumatology Criteria for RA'’” modified for genetic studies'**. Of the cases, 
404 were recruited as part of the arc National Repository of Family Material'”’: of 
these, 301 were probands from affected sibling pair families and 103 were cases 
from trio families, having both parents or one parent and one unaffected sibling 
available for study. A further 109 cases were recruited from the Norfolk Arthritis 
Register, a primary care-based inception collection'”. All other cases (n = 1348) 
were recruited from NHS Rheumatology Clinics throughout the UK. Samples 
for WTCCC were selected from the various studies on the basis of the quality and 
availability of DNA. 

T1D phenotype description. T1D cases were recruited from paediatric and adult 
diabetes clinics at 150 National Health Service hospitals across mainland UK. 
The total T1D case data set (n ≈ 8,000) from which the WTCCC cases were 
selected, represents close to half the T1D cases seen in such clinics. Nationwide 
coverage was achieved through the voluntary efforts of members of the British 
Society for Paediatric Endocrinology and Diabetes, who recruited about half of 
cases, the rest coming from peripatetic nurses employed by the JDRF/WT GRID 
project (http://www-gene.cimr.cam.ac.uk/todd/)'*'. To establish a positive diag- 
nosis of T1D (and, in particular, to distinguish it from the more common, but 
later onset T2D), we required all cases to have an age of diagnosis below 17 yr and 
insulin dependence since diagnosis (with a minimum period of at least 6 
months). However, a very few subjects were subsequently discovered to be suf- 
fering from rare monogenic disorders, such as maturity onset diabetes of the 
young (MODY), and latterly permanent neonatal diabetes (PNDM): these were 
excluded. 

T2D phenotype description. The T2D cases were selected from UK Caucasian 
subjects who form part of the Diabetes UK Warren 2 repository. In each case, the 
diagnosis of diabetes was based on either current prescribed treatment with 
sulphonylureas, biguanides, other oral agents and/or insulin or, in the case of 
individuals treated with diet alone, historical or contemporary laboratory evid- 
ence of hyperglycaemia (as defined by the World Health Organization). Other 
forms of diabetes (for example, maturity-onset diabetes of the young, mitochon- 
drial diabetes, and type 1 diabetes) were excluded by standard clinical criteria 
based on personal and family history. Criteria for excluding autoimmune dia- 
betes included absence of first-degree relatives with T1D, an interval of ≥1 yr 
between diagnosis and institution of regular insulin therapy and negative testing 
for antibodies to glutamic acid decarboxylase (anti-GAD). Cases were limited to 
those who reported that all four grandparents had exclusively British and/or Irish 
origin, by both self-reported ethnicity and place of birth. All were diagnosed 
between age 25 and 75. Approximately 30% were explicitly recruited as part of 
multiplex sibships'*? and ~25% were offspring in parent-offspring ‘trios’ or 
‘duos’ (that is, families comprising only one parent complemented by additional 
sibs)'**. The remainder were recruited as isolated cases but these cases were 
(compared to population-based cases) of relatively early onset and had a high 
proportion of T2D parents and/or siblings'**. Cases were ascertained across the 
UK but were centred around the main collection centres (Exeter, London, 
Newcastle, Norwich, Oxford). Selection of the samples typed in WTCCC from 
the larger collections was based primarily on DNA availability and success in 
passing Diabetes and Inflammation Laboratory (DIL)/Wellcome Trust Sanger 
Institute (WTSI) DNA quality control. 

1958 Birth Cohort Controls (58C). The 1958 Birth Cohort (also known as the 
National Child Development Study) includes all births in England, Wales and 
Scotland, during one week in 1958. From an original sample of over 17,000 
births, survivors were followed up at ages 7, 11, 16, 23, 33 and 42 yr 
(http://www.cls.ioe.ac.uk/studies.asp?section=000100020003)'*. In a biomedical 
examination at 44-45 yr'’® (http://www.b58cgene.sgul.ac.uk/followup.php), 
9,377 cohort members were visited at home providing 7,692 blood samples with 
consent for future Epstein-Barr virus (EBV)-transformed cell lines. DNA sam- 
ples extracted from 1,500 cell lines of self-reported white ethnicity and repres- 
entative of gender and each geographical region were selected for use as controls. 

UK Blood Services Controls (UKBS). The second set of common controls was 
made up of 1,500 individuals selected from a sample of blood donors recruited as 
part of the current project. WTCCC in collaboration with the UK Blood Services 
(NHSBT in England, SNBTS in Scotland and WBS in Wales) set up a UK 
national repository of anonymized samples of DNA and viable mononuclear 
cells from 3,622 consenting blood donors, age range 18-69 yr (ethical approval 
05/Q0106/74). A set of 1,564 samples was selected from the 3,622 samples 
recruited based on sex and geographical region (to reproduce the distribution 
of the samples of the 1958 Birth Cohort) for use as common controls in the 
WTCCC study. DNA was extracted as described below with a yield of 
3,054 ± 1,207 µg (mean ± 1 s.d.). 

Protocol for DNA extraction. White blood cells were isolated from the filters by 
first pushing 10 ml air through the filter in contra direction to the initial blood 
flow through the filter, followed by 40 ml PBS, collecting into a 50 ml centrifuge 
tube, and centrifugation (2,000 r.p.m., 10 min, 20 °C). Cells were lysed by adding 
40 ml Lysis buffer (320 mM Sucrose, 1% Triton-X-100, 4.9 mM MgCl₂, 1 mM 
TRIS-HCl pH 7.4) and pelleted by centrifugation (2,500 r.p.m., 15 min, 4 °C). 
Pellets were frozen before extraction. Pellets were digested overnight at 37 °C 
with 5.25 M GuHCl, 490 mM NH₄Ac, 1.25% Na Sarcosyl and 0.125 mg ml⁻¹ 
Proteinase K and then mixed with 2 ml chloroform to form a white emulsion. 
The aqueous layer was separated by centrifugation (2,500 r.p.m., 3 min) and 
DNA was precipitated in ethanol overnight at −20 °C. DNA was further pre- 
cipitated by rotation (40 r.p.m., 5 min) and then pelleted by centrifugation 
(3,000 r.p.m., 15 min). Pellets were washed twice by rinsing with 2 ml 70% 
ethanol, followed by centrifugation (3,000 r.p.m., 5 min). DNA pellets were 
air-dried before re-suspension in TE buffer (10 mM Tris, 0.1 mM EDTA). 

Sample handling. Each participating sample collection was issued unique 
WTCCC barcode labels and a spreadsheet with unique sample identifiers for 
logging information on case/control status, DNA concentration (requested at 
100 ng µl⁻¹), DNA extraction method, sex, broad geographical region and age at 
requirement. Each collection supplied 10 µg aliquots of anonymized samples in 
bar-coded, deep 96-well plates. On receipt, samples had their DNA concentra- 
tion measured by Picogreen (triplicate measurements), were checked for DNA 
degradation on a 0.75% agarose gel, and genotyped with up to 38 SNPs arranged 
in two multiplex reactions using the MassExtend (hME) and/or iPLEX” assay. 
The above SNPs served for obtaining a molecular fingerprint (25 of the 38 SNPs 
were present on the GeneChip 500K) and experimentally confirming the sex of 
each sample. 

Samples with concentrations ≥50 ng µl⁻¹, showing limited or no degrada- 
tion, having a minimum of 7/10 (hME reaction) and/or 14/23 (iPLEX reaction) 
SNPs typed, and having the sex markers in agreement or not violating the 
supplied information were deemed fit for whole genome genotyping. Note that 
the hME set was replaced with a second iPLEX reaction in the course of the 
project to increase marker density. We selected 2,000 and 1,500 samples from 
each disease and control collection respectively. Selected samples were normalized 
to 50 ng µl⁻¹ and re-arrayed robotically into 96-well plates so that each plate 
was composed of 94 samples representing at least two different collections at a 
ratio of 1:1. For each collection, the selected samples were balanced first for sex 
and then geographical region (see above). 
Genotyping. SNP genotyping was performed with the commercial release of the 
GeneChip 500K arrays at Affymetrix Services Lab. A modified version of the 
genotyping assay developed for the 100K Mapping Array was used. In brief, 
two aliquots of 250 ng of DNA each are digested with NspI and StyI, respectively, 
an adaptor is ligated and molecules are then fragmented and labelled. At this 
stage each enzyme preparation is hybridized to the corresponding SNP array 
(262,000 and 238,000 SNPs on the NspI and StyI arrays, respectively). Samples were 
processed in 96-well plate format up to the hybridization step, and each plate 
carried a positive and a negative control. Individual arrays not passing the 93% call 
rate threshold at P = 0.33 with the Dynamic Model algorithm were repeated 
(fresh aliquot of initial end-labelled reaction). Samples failing twice at the 
hybridization stage were reprocessed using a fresh DNA aliquot. Affymetrix 
delivered as successful those samples having a Dynamic Model call rate of 
at least 93% at P = 0.33 for each array, over 90% concordance for the 50 SNPs that 
are common to the two arrays, agreement between the two arrays on gender, and over 
70% identity to the Sequenom genotypes supplied by WTCCC. 

CEL files provided the intensities of the various probes on each chip. Initially, 
genotypes were called with the Dynamic Model algorithm. Affymetrix subsequently 
developed an improved algorithm, BRLMM (Bayesian Robust Linear 
Model with Mahalanobis distance classifier). This processes batches of 
samples and uses clustering techniques to call genotypes (the ‘mismatch’ probe 
intensities are not used). In Affymetrix’s standard protocol it is applied in 
batches of 96 samples (plates). This is, of course, a very small sample size and, 
for some SNPs, some clusters will contain few, if any, observations. This might be 
countered by combining information about cluster location over a large number 
of SNPs. 

Throughout, physical coordinates refer to NCBI build-35 of the human gen- 
ome. Alleles are expressed in the forward (+) strand of the reference human 
genome (NCBI build-35). 

Power calculations. We assessed power of the Affymetrix 500K chip using the 
following simulation experiment. Separately for each SNP with MAF >5% in the 
10 HapMap ENCODE regions, we assumed the SNP was causative and simulated 
genotype data at all SNPs in the same region as the putative disease SNP in case- 
control panels of 2,000 cases and 3,000 controls with linkage disequilibrium 
patterns that match those in HapMap. For controls, these simulations were based 
on the imputation algorithm described below (with all genotype data initially set 
to missing in the 3,000 control individuals). For cases, the assumed effect size was 
first used to calculate genotype frequencies in cases (via Bayes’ theorem), and 
genotypes in cases at the putative SNP were then simulated independently from 
these calculated frequencies. Genotypes at all other SNPs in the region in cases 
were then simulated using the imputation algorithm described below (with all 
data other than the genotypes at the causative SNP initially set to missing in the 
cases). For each such simulated case-control panel, trend tests were performed at 
each of the SNPs in the region that are actually on the Affymetrix chip, and if any 
of these reached the stated P-value threshold the putative disease SNP was 
deemed to be detected, and otherwise to be undetected. Power estimates are 
then calculated as the proportion of putative disease SNPs with MAFs >5% 
across the HapMap ENCODE regions that are detected at the given P-value 
threshold. There are various approximations here. Actual numbers of cases 
and controls for each disease are slightly smaller than the 3,000:2,000 values used 
in the simulations, but in the other direction, our simulations ignore the pos- 
sibility that a disease SNP might be detected by a genotyped SNP outside its 
ENCODE region. The accuracy reported below of the imputation algorithm in 
imputing genotypes leads us to believe these simulations should be a reasonable 
proxy for real data. Some such simulation is needed if power calculations are to 
take account of the fact that any given putative disease SNP could typically be 
detected by several SNPs on the chip. Exploitation of this simulation approach to 
assess power across different platforms and SNP chips and for different experi- 
mental designs will be reported elsewhere. 
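
The Bayes' theorem step used to obtain genotype frequencies in cases can be illustrated with a minimal sketch. It assumes Hardy-Weinberg proportions in the population and a multiplicative per-allele relative risk; both choices, and the function name, are illustrative assumptions rather than details given in the text.

```python
def case_genotype_freqs(maf, rr_per_allele):
    """Genotype frequencies in cases for genotypes with 0, 1, 2 risk alleles.

    Assumes Hardy-Weinberg proportions in the population and a multiplicative
    per-allele relative risk (both are illustrative assumptions).
    """
    q = maf
    pop = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]   # Pr(genotype)
    risk = [rr_per_allele ** k for k in range(3)]   # Pr(disease | genotype), up to a constant
    joint = [f * r for f, r in zip(pop, risk)]      # numerator of Bayes' theorem
    total = sum(joint)                              # proportional to Pr(disease)
    return [j / total for j in joint]               # Pr(genotype | disease)


freqs = case_genotype_freqs(maf=0.2, rr_per_allele=1.5)
```

Genotypes at the putative causal SNP in cases could then be drawn independently from `freqs`, for example with `random.choices([0, 1, 2], weights=freqs)`.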

CHIAMO. We developed a new genotype calling algorithm, CHIAMO, which is 
applied after quantile normalization of the data from each sample. A complete 
description is given in Supplementary Information. We briefly summarize some 
features here. Normalized intensities for each genotype were mapped to a two- 
dimensional intensity vector and then we applied CHIAMO, which uses a baye- 
sian hierarchical 4-class mixture model to call genotypes for the whole project. 
We used optimization based on 12 random starts to find the set of parameters ($\hat{\theta}$) 
that maximize the posterior distribution of the model. This parameter set was 
used to calculate the maximum a posteriori estimates of the probabilities of each 
genotype call, $\Pr(Z_{ij} \mid \text{Data}, \hat{\theta})$, where $Z_{ij} \in \{0, 1, 2, 3\} = \{AA, AB, BB, \text{null}\}$ is the 
genotype call for individual j in collection i. All CHIAMO genotype calls analysed 
in this paper were based on an a posteriori probability threshold of 0.9 for 
making a call, following our analysis of the relationship between concordance 
and missing data rates (data not shown). CHIAMO differs from BRLMM in 
several respects: (1) it uses a different transformation of the CEL files to give the 
two-dimensional summary for each individual at an SNP leading to better 
defined clusters; (2) it makes use of mis-match probe signals; (3) it uses a 
different method for fitting the clusters; and (4) it allows the data for all samples 
to be called simultaneously, thus allowing better estimation of cluster location 
and shape parameters, while making allowance for possible differences in these 
parameter values between case/control groups that could arise as a result of 
differences in DNA quality. This is achieved using a hierarchical statistical model 
that specifies the joint distribution of the three cluster centres, their spread, and 
likely allele frequencies (using HapMap) and genotype frequencies (centred on 
Hardy-Weinberg proportions but allowing some variation). 

CHIAMO improved both call rate and accuracy in comparison to BRLMM, 
the current standard Affymetrix calling algorithm (Supplementary Table 3)—it 
roughly halved missing data rates and discordance rates with another platform. 
See Supplementary Information for full details, discussion of some challenges for 
genotype calling, and example cluster plots (Supplementary Figs 10 and 17). 
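
The thresholding step described above (a call is made only when the posterior probability reaches 0.9) can be sketched as follows. This is not CHIAMO itself: the function name, the use of `None` for a missing call, and the use of `>=` rather than `>` at the threshold are assumptions made for illustration.

```python
GENOTYPES = ("AA", "AB", "BB", "null")


def call_genotype(posteriors, threshold=0.9):
    """Return the most probable class, or None (missing) if it is below threshold.

    `posteriors` holds the four class probabilities in the order of GENOTYPES.
    """
    if len(posteriors) != len(GENOTYPES):
        raise ValueError("expected one posterior probability per class")
    best = max(range(len(GENOTYPES)), key=lambda k: posteriors[k])
    return GENOTYPES[best] if posteriors[best] >= threshold else None


confident = call_genotype([0.95, 0.03, 0.01, 0.01])   # clear AA cluster
ambiguous = call_genotype([0.60, 0.38, 0.01, 0.01])   # overlapping clusters
```

Lowering the threshold trades a lower missing data rate for lower concordance, which is the relationship the text says was used to choose 0.9.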
Quantile-quantile plots. Quantile-quantile (Q-Q) plots are constructed by 
ranking a set of values of a statistic from smallest to largest (the ‘order statistics’) 
and plotting them against their expected values, given the assumption that the 
values have been sampled from a distribution of known theoretical form (in our 
case, the chi-squared distribution, usually on one degree of freedom—for 
example, the distribution of our trend tests under the null hypothesis). 
Deviations from the line of equality indicate either that the theoretical distri- 
bution is incorrect, or that the sample is contaminated with values generated in 
some other manner (for example, by a true association). To aid interpretation of 
such plots we have also calculated 95% ‘concentration bands’ (shaded grey in all 
Q-Q plots). These are formed by calculating, for each order statistic, the 2.5th 
and 97.5th centiles of the distribution of the order statistic under random sam- 
pling and the null hypothesis (for details see ref. 141). We should add two notes 
of caution. First, concentration bands are calculated point by point and, 
although there are very strong correlations between nearby order statistics, the 
probability that a real quantile-quantile plot will stray outside the concentration 
band at some point is somewhat larger than 5%. Second, the theoretical chi- 
squared distribution is an approximation, valid for large samples; it is not clear 
whether this approximation continues to hold into the extreme right hand tail of 
the distribution explored in a GWA study (although the indications are that it is 
probably not far wrong for a study as large as ours). 
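
As a minimal sketch of the construction, the expected values for a 1 d.f. chi-squared Q-Q plot can be computed with the Python standard library. The plotting position (k − 0.5)/n is an assumption (the text does not state which approximation to the expected order statistics was used), and the concentration bands are not computed here.

```python
from statistics import NormalDist


def chi2_1df_quantile(p):
    """Quantile of the 1 d.f. chi-squared distribution via the normal quantile."""
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    return z * z


def qq_points(stats):
    """Pair each order statistic with an approximate expected value under the null."""
    n = len(stats)
    observed = sorted(stats)
    expected = [chi2_1df_quantile((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(expected, observed))


points = qq_points([0.1, 2.0, 0.5])
```

Plotting the second element of each pair against the first gives the Q-Q plot; points above the line of equality in the right-hand tail are the ones of interest.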

Data quality control. Of samples for which Affymetrix returned CEL files, a total 
of 809 were excluded from the analysis. A complete breakdown by collection is 
given in Supplementary Table 4. Missing data rate per sample acts as an indicator 
of low DNA quality. Most samples had very low rates of missing data (study-wide 
average 0.00925, standard deviation 0.0187) and we chose to exclude 250 samples 
with >3% missing data across all SNPs (Supplementary Fig. 18, and Supple- 
mentary Tables 4 and 13). We also set empirical thresholds on genome-wide 
heterozygosity (excess heterozygosity in particular may indicate contamination). 
Six samples with >30% heterozygosity and a further three with <23% hetero- 
zygosity were excluded (see Supplementary Fig. 18). We excluded 16 samples 
with discrepancies between WTCCC information and external identifying 
information (such as genotypes from another experiment, blood type or incor- 
rect disease status). We sought to detect individuals with non-Caucasian ances- 
try using multi-dimensional scaling to provide a two-dimensional projection of 
the data whose axes represent geographic genetic variation. In the interest of 
computational efficiency and to avoid confounding of the multi-dimensional 
scaling by extended linkage disequilibrium we thinned the data to a set of 71,458 
SNPs, within which no pair were correlated with r² > 0.2. For this set of nearly 
independent SNPs we computed genome-wide average identity by state (sum of 
the number of identical-by-state alleles at each locus divided by twice the number 
of loci) between each pair of individuals in each sample collection along with the 
270 HapMap samples. We converted these identity-by-state relationships to 
distances by subtracting them from 1, and the matrix of pairwise identity by 
state values was used as input to multi-dimensional scaling. The projection onto 
the two multi-dimensional scaling axes is shown in Supplementary Fig. 5. We 
excluded 153 samples that were clearly separate from the main cluster of 
WTCCC individuals. Exclusion of these individuals resulted in a substantial 
reduction in estimates of over-dispersion in test statistic distributions (data 
not shown). We also excluded 295 duplicated (>99% identity) and 86 related 
(86-98% identity) samples from the analysis. 
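
The identity-by-state measure defined above (number of identical-by-state alleles summed over loci, divided by twice the number of loci) and its conversion to a distance can be sketched as follows. Genotypes are assumed to be coded 0, 1 or 2, and skipping loci with a missing call is an assumption, as the text does not say how missing genotypes were handled.

```python
def ibs_similarity(g1, g2):
    """Genome-wide average identity by state between two individuals.

    Genotypes are coded 0, 1, 2 (copies of one allele); None marks a missing call.
    """
    shared = 0
    loci = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # assumption: loci with a missing call are skipped
        shared += 2 - abs(a - b)  # identical-by-state alleles at this locus
        loci += 1
    if loci == 0:
        raise ValueError("no loci with both genotypes called")
    return shared / (2 * loci)


def ibs_distance(g1, g2):
    """Distance used as input to multi-dimensional scaling: 1 minus the IBS."""
    return 1.0 - ibs_similarity(g1, g2)


d = ibs_distance([0, 1, 2, None], [1, 1, 0, 2])
```

With this definition, pairs with similarity above 0.99 would correspond to the duplicated samples described in the text.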

Filtering out suboptimal markers depends on both the platform and the 
genotype calling algorithm. We experimented with various quality metrics for 
CHIAMO calls, for example, based on the location and/or separation of the 
clusters, but found that the best indicator of a SNP being difficult to call was 
the amount of missing data in its calls: CHIAMO consistently marked many 
individuals missing for SNPs with poorly defined or overlapping clusters, 
whereas it successfully called genotypes for nearly all individuals on high-quality 
SNPs (data not shown). We excluded 26,567 SNPs with a study-wide missing 
data rate >5% (Supplementary Fig. 19), or >1% for SNPs with a study-wide 
MAF <5%. We additionally excluded 4,351 SNPs with Hardy-Weinberg exact 
P value < 5.7 × 10⁻⁷ in the combined set of 2,938 controls, and 93 SNPs with 
P value < 5.7 × 10⁻⁷ for either a one- or two-degree of freedom test of association 
between the two control groups (corresponding to a 1 d.f. chi-squared 
statistic of about 25). See Supplementary Fig. 20 and Fig. 1 respectively for the 
empirical distributions of these statistics used to motivate the thresholds above. 
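
The correspondence between a 1 d.f. chi-squared statistic of about 25 and a P value of about 5.7 × 10⁻⁷ can be checked directly, and the missing-data rule for SNPs can be written as a small predicate. The function names are illustrative, and treating the thresholds as inclusive limits is an assumption.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a 1 d.f. chi-squared statistic."""
    return math.erfc(math.sqrt(x / 2.0))


def passes_missingness_filter(missing_rate, maf):
    """Missing-data rule: at most 5% missing, or at most 1% when MAF < 5%."""
    limit = 0.01 if maf < 0.05 else 0.05
    return missing_rate <= limit


p_at_25 = chi2_sf_1df(25.0)  # roughly 5.7e-7
```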

Overall, we found that the 809 excluded individuals (which represent 4.8% of 
the study samples) accounted for 35.6% of the missing data at non-excluded 
SNPs. In total, 469,557 SNPs passed the quality control filters. 

Supplementary Fig. 20 shows the effect of quality control filters, and visual 
inspection of the cluster plots of SNPs showing apparently strong association, on 
quantile-quantile plots for one disease (T2D, others are similar), and the success 
of these filters in excluding poorly performing SNPs. The figure (panel d) also 
shows the marked effect on the tails of the distribution of test statistics of regions 
of genuine association (for this disease the three regions removed because of 
strong evidence of association have all been independently replicated, see main 
text). The aim in filtering is to exclude poor SNPs but without removing genuine 
associations. No single criterion will do this. In order not to exclude possible 
genuine associations, we chose to apply relatively light quality control filters but 
then to subject all apparently associated SNPs to visual inspection of cluster plots 
(see Supplementary Information). Around 100 cluster plots were assessed per 
disease. 

We used X-chromosome SNPs to check for sex discrepancies with the sample 
files (Supplementary Fig. 21). These were fed back to disease groups for amend- 
ment and verification. The ~80 samples where it was not possible to discern the 
source of the discrepancy were left in the study for analysis, on the grounds that 
mishandling was considered unlikely to have introduced samples with altogether 
different phenotypes. 

Differences in DNA quality between cases and controls could result in false-positive 
associations through differential effects on genotype calling. DNAs in our study came 
from various sources between, and in some cases within, case and control series, 
but with the combination of centralized sample quality control, simultaneous 
genotype calling with CHIAMO (which explicitly allows for differences between 
collections), and inspection of cluster plots for SNPs with very small P values, our 
study did not experience such difficulties. 

Comparing linkage disequilibrium. Two questions which have been raised 
about the HapMap data are how well it describes linkage disequilibrium in 
populations other than the ones that were sampled, and whether the sample 
sizes in HapMap (60 Caucasian individuals, for example) are adequate to 
describe patterns of linkage disequilibrium. With data on 2,938 controls and 
16,179 individuals in total at around 400,000 polymorphic SNPs, we are well 
placed to address this for the British population. Initial analyses suggest that 
patterns of linkage disequilibrium in our samples are very similar to those in 
HapMap. As an example, Supplementary Fig. 3 compares patterns of linkage 
disequilibrium in HapMap CEU individuals and our 58C sample at SNPs on the 
Affymetrix chip across 22 1-Mb regions of the genome and they seem almost 
identical. We calculated r² values directly from the phased haplotypes available in 
HapMap, but using unphased genotype data from our study. Note that visual 
representations of linkage disequilibrium in this form can be very sensitive to 
SNP density so comparisons across regions are difficult without correction for 
SNP density, and direct comparison of linkage disequilibrium patterns at all 
HapMap SNPs with those at the subset of SNPs on the Affymetrix 500K chip 
is not straightforward. 

Geographical variation and population structure. Principal component ana- 
lysis was performed as a two-stage process: we formed a matrix of estimated 
correlations (formally, the inner product measure of similarity) between all pairs 
of individuals, and then computed the eigenvectors and eigenvalues of that 
matrix. We estimated the correlation between two individuals as described 
previously. We identified components that reflected genome-wide structure in two 
ways. First, we created two subsets of the data containing SNPs from the odd- 
and even-numbered chromosomes, repeated the PCA on each of these, and 
inspected scatter plots of pairs of components between the two subsets of the 
data. A component which is due to a region of linkage disequilibrium on a 
chromosome (as opposed to genome-wide structure) will appear only when 
analysing the data set containing SNPs from that chromosome. Second, we 
computed the score of every SNP on the components. For a component that 
is due to a region of linkage disequilibrium, there will be a spike of high SNP 
scores only in that region. To minimize the contribution from regions of extens- 
ive strong linkage disequilibrium, the correlation estimates were based on a 
subset of 197,175 SNPs that were spaced at least 0.001 cM apart (HapMap esti- 
mates) and specifically excluded the MHC region. 

To assess the level of over-dispersion in each collection we first created a very 
clean set of data to ameliorate the effects of over-dispersion due to calling 
problems and missing data. In addition to the main filters described above, we 
filtered out all SNPs that had a clear genotype-calling problem revealed by visual 
inspection, SNPs with a study-wide missing data rate >1% and SNPs with study- 
wide minor allele frequency <1%. Around 360,000 SNPs passed these filters. 
Estimates of λ were calculated using an estimator based on the median test 
statistic. Estimates of λ were also calculated from tests that conditioned on 
the scores for each individual along the two estimated principal components 
described above. The tests (1 d.f. and 2 d.f.) were carried out by including the 
scores as additional covariates in a logistic regression model fit. 
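
A minimal sketch of a median-based over-dispersion estimate for 1 d.f. statistics follows. It uses the usual genomic-control form (observed median divided by the median of the 1 d.f. chi-squared distribution); that this matches the cited estimator exactly is an assumption.

```python
from statistics import NormalDist, median

# Median of the 1 d.f. chi-squared distribution (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def overdispersion_estimate(stats):
    """Ratio of the observed median test statistic to its null expectation."""
    return median(stats) / CHI2_1DF_MEDIAN


lam = overdispersion_estimate([0.2, 0.5, 0.9])
```

Values close to 1 indicate little over-dispersion; using the median rather than the mean keeps the estimate insensitive to a small number of genuinely associated SNPs.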
Bayes factors. The box in the main text makes the point that understanding the 
strength of evidence conveyed by a particular P value also requires knowledge of 
power. In contrast, the Bayes factor (BF) provides a single measure of the 
strength of the evidence for an association, and we report these in addition to 
P values (Supplementary Table 14). As for power, calculation of Bayes factors 
requires assumptions about effect sizes. The assumptions underlying our calcu- 
lations are given below and in Supplementary Information. 

There is broad agreement between the way in which P values and our Bayes 
factors rank SNPs, except for SNPs with low MAFs (Supplementary Fig. 22). This 
is intuitive: unless one believed, a priori, that rare causative SNPs have substantially 
larger effect sizes, there will be reduced power for these SNPs and hence 
weaker evidence for association than for common SNPs with the same P value. 

One perspective on GWAs is that in practice they will be used to prioritize 
SNPs for further study or additional typing. In addition to BFs providing a single 
quantity that can be directly compared between SNPs, it is also straightforward 
for investigators to give different a priori weights to different classes of SNPs, 
such as non-synonymous (ns)SNPs, genic SNPs, SNPs in highly conserved 
regions, or SNPs in linkage disequilibrium with many (or few) other SNPs. 

We now describe calculation of the Bayes factors. We use $M_0$ to denote a 
model of no association, $M_1$ for a model with an additive effect on the log-odds 
scale and $M_2$ for a general 3 parameter model of association. At each SNP we 
calculate two Bayes factors: one for the additive model versus the null model, 
$BF_1$, and one for the general model versus the null model, $BF_2$. That is, 

$$BF_1 = \frac{\Pr(\text{Data} \mid M_1)}{\Pr(\text{Data} \mid M_0)}, \qquad BF_2 = \frac{\Pr(\text{Data} \mid M_2)}{\Pr(\text{Data} \mid M_0)},$$

where $\Pr(\text{Data} \mid M_j) = \int \Pr(\text{Data} \mid \theta_j, M_j) \Pr(\theta_j \mid M_j)\, d\theta_j$ and $\theta_j$ denotes the 
parameters for model $M_j$. For all 3 models we use a logistic regression model for the 
likelihood $\Pr(\text{Data} \mid \theta_j, M_j)$, where the log-odds for individual i is equal to $\mu$ for 
model $M_0$, $\mu + \gamma Z_i$ for model $M_1$ and $\mu + \gamma I(Z_i = 1) + \phi(2\gamma I(Z_i = 2))$ for model 
$M_2$. $Z_i$ is the genotype (coded 0, 1 and 2) for individual i and $I(Z_i = m)$ is the 
indicator function that individual i has the genotype coded as m. For each model 
we choose the priors on the parameters, $\Pr(\theta_j \mid M_j)$, to reflect our belief about the 
likely effect sizes underlying complex trait loci. 

The parameter $\gamma$ in models $M_1$ and $M_2$ is the increase in log-odds of disease for 
every copy of the allele coded as 1, and $e^{\gamma}$ is the additive model odds ratio. For 
both models we use a N(0, 0.2) prior on $\gamma$. This prior puts probability 0.31 on 
odds ratios above 1.2 or below 0.8, and probability 0.02 on odds ratios above 1.5 
or below 0.5. The parameter $\mu$ in all three models represents the baseline odds of 
disease. In a case-control design the numbers of cases in the sample have been 
elevated artificially, which will have a large effect on likely values of $\mu$. Our prior 
beliefs about the baseline risk of disease must take this into account. For all three 
models we have used a N(0, 1) prior for $\mu$ and have found that the resulting Bayes 
factors are relatively insensitive to choice of priors for this parameter as long as 
the same prior is used for the two models being compared. The parameter $\phi$ in 
model $M_2$ represents a recessive effect over and above an additive effect. We use a 
N(1, 1) prior for $\phi$. Combined with the prior on $\gamma$, this results in a prior 
probability of 0.25 on the odds ratios above 1.5 or below 0.5 for the genotype coded 
as 2. In addition, we note that the evaluation of the Bayes factors will depend on 
the way the alleles at the SNP have been coded 0 and 1. To account for this we 
average over the two possible codings of each SNP with equal weight. A fuller 
description of the priors used can be found in Supplementary Information. 
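
The prior probabilities quoted for the additive effect can be reproduced numerically. The sketch below reads the second argument of N(0, 0.2) as a standard deviation, which is an assumption that the quoted figures of 0.31 and 0.02 support.

```python
import math
from statistics import NormalDist

# Prior on the per-allele log odds ratio; 0.2 is read as a standard deviation.
prior = NormalDist(mu=0.0, sigma=0.2)


def prob_odds_ratio_outside(lower, upper):
    """Prior probability that the odds ratio is below `lower` or above `upper`."""
    return prior.cdf(math.log(lower)) + (1.0 - prior.cdf(math.log(upper)))


p_modest = prob_odds_ratio_outside(0.8, 1.2)  # about 0.31
p_large = prob_odds_ratio_outside(0.5, 1.5)   # about 0.02
```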
Sex-differentiated tests. We examined the possibility of differential genetic 
effects in males and females by reapplying the two single-locus analyses (trend 
test and genotypic test) separately in males and females and combining the 
results (simply adding the chi-squared statistics for the male and female analyses, 
and comparing with the 2 d.f. or 4 d.f. null hypothesis; results are shown in 
Supplementary Table 15). We refer to this as a sex-differentiated test. This test 
is sensitive to association that is of a different magnitude and/or direction in the 
two sexes, although it is less powerful than the simple test when the effect size 
does not vary with sex. 
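
Combining the sex-specific statistics is simple enough to sketch with closed-form tail probabilities, which exist for chi-squared distributions on 2 and 4 d.f. The function names are illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared statistic on 2 or 4 d.f."""
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 4:
        return math.exp(-x / 2.0) * (1.0 + x / 2.0)
    raise ValueError("only 2 or 4 d.f. are handled in this sketch")


def sex_differentiated_p(stat_male, stat_female, df_each=1):
    """Add the male and female statistics and refer the sum to 2*df_each d.f."""
    return chi2_sf_even_df(stat_male + stat_female, 2 * df_each)


p_trend = sex_differentiated_p(3.0, 3.0)                 # two 1 d.f. trend tests
p_geno = sex_differentiated_p(4.0, 5.488, df_each=2)     # two 2 d.f. genotypic tests
```

The extra degrees of freedom are the reason the combined test loses power when the effect is the same in both sexes.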

X Chromosome analysis. For several reasons the X chromosome needs to be 
treated differently from the autosomes (note that the Affymetrix chip used does 
not assay the Y chromosome). First, sample sizes and hence power are different 
from the autosomes (only one copy of X in males). Also, because the effective 
population size on the X chromosome is smaller than the autosomes, linkage 
disequilibrium extends further. And unlike the autosomes, there are choices in 
how to implement even single locus analyses: these relate to the relative weight to 
be given to males and females in comparisons between cases and controls. 

For autosomal SNPs, the 1 d.f. trend test statistic is calculated by dividing the 
square of the difference between means of the SNP genotypes (scored 0, 1, 2) 
between cases and controls by an estimate of its variance. The variance estimate 
used is an empirical estimate that does not assume Hardy-Weinberg equilibrium. 
The numerator can also be represented as the squared difference in allele 
frequencies between cases and controls, as in the allele counting test. At first 
sight, a natural generalization of this test to deal with SNPs on the X chromosome 
would involve comparing allele frequencies, by allele counting, but using a 
variance estimate which does not assume Hardy-Weinberg equilibrium in 
females. However, we took the view that, because most loci on the X chro- 
mosome are subject to X chromosome inactivation, it is more logical to treat 
males as if they were homozygous females. Thus we score female genotypes 0, 1 
or 2 and male genotypes 0 or 2, comparing mean scores of cases and controls as 
before. The variance estimate allows for the different variance of male and female 
contributions and does not assume Hardy-Weinberg equilibrium in females. 
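
The autosomal trend statistic and the X-chromosome scoring rule can be sketched as follows. The variance here is the pooled empirical variance of the scores, a standard form that does not assume Hardy-Weinberg equilibrium. The sketch does not implement the sex-specific variance described in the text for the X chromosome, so the helper `x_scores` illustrates the scoring rule only; all names are illustrative.

```python
def trend_statistic(case_scores, control_scores):
    """1 d.f. trend statistic: squared difference in mean score over its variance.

    The variance is estimated from the pooled scores, so Hardy-Weinberg
    equilibrium is not assumed.
    """
    n1, n0 = len(case_scores), len(control_scores)
    pooled = list(case_scores) + list(control_scores)
    n = n1 + n0
    mean_all = sum(pooled) / n
    var_all = sum((s - mean_all) ** 2 for s in pooled) / n
    if var_all == 0:
        raise ValueError("monomorphic SNP")
    diff = sum(case_scores) / n1 - sum(control_scores) / n0
    return diff ** 2 / (var_all * (1.0 / n1 + 1.0 / n0))


def x_scores(allele_counts, is_male):
    """Score X genotypes: females 0, 1 or 2; males 0 or 2 (allele count doubled)."""
    return [2 * g if male else g for g, male in zip(allele_counts, is_male)]


stat = trend_statistic([2, 2, 1, 1], [0, 0, 1, 1])
```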

A stratified version of the test is constructed using the same principles by 
which the trend test is extended to the Mantel extension test; a score that 
contrasts cases and controls is computed for each stratum together with its variance; 
these are then summed over strata. The final test is the squared total score divided 
by the total variance. To extend these tests to a 2 d.f. test, we add a score that 
compares heterozygosity between cases and controls. Clearly, only females con- 
tribute to this component. Results of these analyses of X chromosome SNPs are 
shown in Supplementary Table 16. 
Multilocus analysis. We use (1) the genotype data of this study, (2) the HapMap 
data, and (3) a population genetics model, to simulate genotypes at the HapMap 
SNPs that are not on the Affymetrix 500K chip. Informally, we determine which 
haplotypes are present in each individual in a region, and then use HapMap to 
‘fill in’ these haplotypes at untyped SNPs (see below for details). These ‘in silico’ 
genotypes are then tested for association with the disease as before. This powerful 
multilocus tool for association studies has the advantage of using information 
from all markers in linkage disequilibrium with an untyped SNP, but in a way 
that decreases with genetic distance. Our imputation method was applied to 
individuals passing project filters, and used markers which passed the project 
filters and in addition had MAF > 1%. As a validation we compared our imputed 
genotypes for 58C individuals with genotypes obtained on an Illumina platform 
for 10,180 SNPs that are polymorphic in CEU HapMap samples. At these SNPs, 
for imputed genotypes with posterior call probabilities above 0.95, there was 
98.4% agreement with the Illumina genotypes. 

In our association analyses we imputed genotypes at 2,139,483 HapMap SNPs, 
and tested these for association with each disease using the trend test or the 
genotypic test. We included the results from imputed SNPs in the signal plots 
(Fig. 5) because they are useful in (1) assessing signal strength within a region; (2) 
providing a wider range of SNPs for follow up; and (3) indicating possible 
locations for the causal variant. For example in the case of TCF7L2 in T2D, there 
is a substantially stronger signal from rs7903146 than for any of the typed SNPs 
(see also Supplementary Fig. 12). 

To be conservative, stringent quality control filters were applied to genomic 
regions where imputed SNPs (but not genotyped SNPs) were responsible for a 
strong signal for association. These were as follows: (1) any such region was 
required to contain more than one imputed SNP showing the required level of 
association with a MAF > 2% and posterior probability for imputed genotypes 
averaged across the SNP >0.95 (empirical studies showed imputation at low 
MAF SNPs more prone to error); (2) all cluster plots for genotyped SNPs within 
0.3 cM (from HapMap Phase II estimated recombination rates) were checked 
and where there was evidence of any mis-calling the region was rejected (the 
major problem with imputation arises around SNPs with genotype calling 
errors); and (3) if there was no genotyped SNP with a P value < 10 * for asso- 
ciation on either trend or genotypic test, the region was rejected. Note that 
accuracy of imputation with these filters applied will be larger than the figure 
of 98.4% reported above. 

We use $H = \{H_1, \ldots, H_N\}$ to denote a set of N known haplotypes where 
$H_i = \{H_{i1}, \ldots, H_{iL}\}$ is an individual haplotype and L is the number of SNP loci. 
In practice, we set H to be the 120 CEU haplotypes estimated as part of the 
HapMap project owing to the expected similarity in haplotype structure between 
the CEU and UK populations. We let $G = \{G_1, \ldots, G_K\}$ denote the genotype data 
on the K individuals in the study where $G_i = \{G_{i1}, \ldots, G_{iL}\}$ and $G_{ik} \in \{0, 1, 2, \text{missing}\}$. 
In this setting, the majority of SNPs will have entirely missing geno- 
types, because the Affymetrix 500K chip has approximately 1/6th of the number 
of SNPs in the Phase II HapMap. The missing genotypes are imputed by mod- 
elling the distribution of each individual’s genotype vector G; conditional on the 
known set of haplotypes H, Pr(G;|H). Our model for each individual’s genotype 
vector is a Hidden Markov Model in which the hidden states are a sequence of 
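pairs of the N known haplotypes in the set H. That is, 

$$\Pr(G_i \mid H) = \sum_{Z_i^{(1)},\, Z_i^{(2)}} \Pr(G_i \mid Z_i^{(1)}, Z_i^{(2)}, H)\, \Pr(Z_i^{(1)}, Z_i^{(2)} \mid H),$$

where $Z_i^{(1)}$ and $Z_i^{(2)}$ are the two sequences of hidden (copying) states. The term 
$\Pr(Z_i^{(1)}, Z_i^{(2)} \mid H)$ defines 
our prior probability on how the sequences of copying states change along the 
sequence, and the term $\Pr(G_i \mid Z_i^{(1)}, Z_i^{(2)}, H)$ allows the observed genotypes to be 
close to but not exactly the same as the haplotypes being copied. The precise 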
form of these terms (described in ref. 142) is based on an approximate population 
genetics model that makes direct use of the recently estimated fine-scale 
recombination map across the genome. At each of the missing genotypes in 
the study, we use this model to calculate probabilities for the three possible 
genotypes. At each imputed SNP, we used these probabilities to calculate the 
2 × 3 table of expected genotype counts for cases and controls and used these 
counts to carry out a standard test of association. 
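
The construction of the 2 × 3 table of expected genotype counts from imputed genotype probabilities can be sketched as follows. The input layout (one triple of probabilities per individual and a parallel list of case indicators) is an assumption made for illustration.

```python
def expected_genotype_table(probs, is_case):
    """Build a 2 x 3 table of expected genotype counts at one imputed SNP.

    probs[i] holds the three imputed genotype probabilities for individual i;
    is_case[i] is True for cases and False for controls.
    Row 0 of the result is controls, row 1 is cases.
    """
    table = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for p, case in zip(probs, is_case):
        row = table[1 if case else 0]
        for g in range(3):
            row[g] += p[g]  # each individual contributes fractional counts
    return table


table = expected_genotype_table(
    [[1.0, 0.0, 0.0], [0.2, 0.5, 0.3], [0.0, 0.5, 0.5]],
    [True, True, False],
)
```

The resulting table can be passed to the same trend or genotypic test used for directly typed SNPs.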

Disease models. To test for deviations from additivity (in log-odds) at a locus we 
fit a logistic regression model using the function glm in the statistical software R 
(http://www.r-project.org/). For each region we considered the most significant 
SNP and compared an additive model to a general 2-d.f. model by fitting a model 
with an additive sub-model nested in a general model. The additive effect was 
modelled by a variable encoded 0, 1, or 2 for the effect at the three genotypes and 
a second term for a general model was included by a variable encoded 1 for 
heterozygotes and 0 otherwise. We rejected an additive model if the second term 
was significant and then compared a dominant or recessive model to a general 
model. For the pairwise interaction analysis, we fixed the marginal model at each 
locus on the basis of the single locus analysis. We compared the two locus model 
with these marginals and no interaction terms with a larger model including 
interactions. This larger interaction model has 1, 2, or 4 additional parameters 
depending on whether both marginal models are additive, one is additive and 
one general, or both general. 

Software. Several software packages were developed within the WTCCC for data 
analysis, data management and simulation studies. We found it necessary to 
normalize the Affymetrix probe intensity data to minimize chip-to-chip vari- 
ability. A C++ program was written to carry out this normalization efficiently. 
To obtain a copy of the software please email Hin-Tak Leung at 
hin-tak.leung@cimr.cam.ac.uk. 

We developed a new genotype calling algorithm, CHIAMO, implemented 
in C++. CHIAMO uses a hierarchical statistical model, which allows it to 
simultaneously call genotypes at all data samples. To obtain a copy of the soft- 
ware please email J. L. Marchini at marchini@stats.ox.ac.uk. 

To perform genome-wide association analysis we developed two software 
packages: snpMatrix and SNPTEST. snpMatrix is an R package and is freely 
available from http://www-gene.cimr.cam.ac.uk/clayton/software/. Both quantitative 
and qualitative phenotypes can be analysed using snpMatrix and flexible 
association testing functions are provided that control for potential confounding 
by quantitative and qualitative covariates. SNPTEST is a standalone C++ pro- 
gram that implements both frequentist tests and bayesian analysis of association 
and allows the user to include quantitative or qualitative covariates. This program 
works directly with the output of CHIAMO and IMPUTE (see below). To obtain a 
copy of the software please email J. L. Marchini at marchini@stats.ox.ac.uk. 

Genotypes at SNPs that are in HapMap but not on the Affymetrix 500K chip 
were imputed using the C++ program IMPUTE, which makes use of genotype 
information at neighbouring SNPs. To obtain a copy of the software please email 
J. L. Marchini at marchini@stats.ox.ac.uk. 

Developmental reprogramming after chromosome transfer into mitotic mouse zygotes 

Dieter Egli, Jacqueline Rosains, Garrett Birkhoff & Kevin Eggan 


Until now, animal cloning and the production of embryonic stem cell lines by somatic cell nuclear transfer have relied on 
introducing nuclei into meiotic oocytes. In contrast, attempts at somatic cell nuclear transfer into fertilized interphase 
zygotes have failed. As a result, it has generally been assumed that unfertilized human oocytes will be required for the 
generation of tailored human embryonic stem cell lines from patients by somatic cell nuclear transfer. Here we report, 
however, that, unlike interphase zygotes, mouse zygotes temporarily arrested in mitosis can support somatic cell 
reprogramming, the production of embryonic stem cell lines and the full-term development of cloned animals. Thus, human 
zygotes and perhaps human embryonic blastomeres may be useful supplements to human oocytes for the creation of 
patient-derived human embryonic stem cells. 


In the first successful mouse nuclear transfer experiments, McGrath 
and Solter exchanged the pronuclei of two fertilized zygotes. The 
resulting embryos developed in vitro into blastocysts and in vivo into 
mice. However, when the authors transferred nuclei from cells at later 
developmental stages into zygotes, the embryos failed to develop. 
On the basis of their initial technical success but ultimate failure to 
reprogramme more differentiated nuclei, they concluded that mammalian 
cloning would be impossible. 

The mammalian cloning field was reinvigorated when it was 
demonstrated that, unlike the zygote, the cytoplasm of the unfertilized 
oocyte could support reprogramming after nuclear transfer. This realization 
allowed the generation of sheep, rabbits, pigs and mice from 
embryonic blastomeres, the production of sheep from cultured 
embryonic fibroblasts, and ultimately the cloning of Dolly. 
Numerous mammalian species have now been cloned from adult cells 
by somatic cell nuclear transfer into unfertilized oocytes. 

More recently, the ability of mouse and bovine zygotes to support 
nuclear reprogramming has been reinvestigated. In these experiments, 
developmental potential after nuclear transfer was highest when 
unfertilized oocytes were used and decreased rapidly after either fertilization 
or artificial activation. Together, these studies in oocytes and 
zygotes suggest that activities crucial for nuclear transfer and/or 
reprogramming are lost to the oocyte cytoplasm after fertilization. 

Conclusions drawn from animal experiments have also informed 
thinking about efforts to produce human embryonic stem cell lines 
by somatic cell nuclear transfer, and it is therefore generally accepted 
that unfertilized human oocytes will be needed for this procedure. 
Unfortunately, human oocytes are difficult to obtain and their pro- 
curement raises medical, logistical and ethical questions, primarily 
surrounding the participation of women as oocyte donors. 

We have revisited nuclear transfer into zygotes and considered the 
possibility that factors required for either reprogramming or embryonic 
development, present in the cytoplasm of unfertilized meiotic 
oocytes, become sequestered in the pronuclei of zygotes (Fig. 1a). 
Such partitioning could explain the failure of enucleated interphase 
zygotes to support development after nuclear transfer, because 
removal of the pronuclei during enucleation would deplete these 
factors and prevent development. In contrast, removal of the condensed 
chromosomes from a metaphase II, meiotic egg would not do 
so. If this model is correct, breakdown of the pronuclear envelope at 
entry into the first embryonic mitosis might liberate the critical factor 
or factors into the cytoplasm, once again allowing chromosome 
removal without factor depletion (Fig. 1a). 

To test whether the cytoplasm of a mitotic zygote could indeed 
support nuclear reprogramming, we reversibly arrested mouse 
zygotes in mitosis, removed their chromosomes and replaced them 
with the chromosomes from either embryonic or somatic donor cells. 
We found that these reconstructed zygotes could be used to generate 
cloned animals and embryonic stem (ES) cell lines. 


Reversible mitotic arrest of mouse zygotes 

To synchronize zygotes in mitosis, we transferred interphase zygotes, 
containing two distinct pronuclei (Fig. 1c), into the microtubule- 
depolymerizing drug nocodazole. The spindle assembly checkpoint 
detects defects in spindle structure and delays chromosome segregation 
until they are corrected. Therefore, spindle disruption with 
nocodazole results in mitotic arrest. At entry into mitosis, we 
observed breakdown of the pronuclear envelope and condensation 
of both maternal and paternal genomes (Fig. 1d). In the presence of 
nocodazole, condensed chromosomes were not assembled onto a 
spindle and zygotes did not proceed through mitosis. Although 
nocodazole stably and reversibly arrested zygotes (Supplementary 
Tables 1 and 2, Supplementary Movie 1), chromosome position 
could not be observed (Fig. 1d, top panel) without the DNA dye 
Hoechst 33342 and illumination with ultraviolet light, which 
compromised later development. We therefore could not routinely 
remove the chromosomes, a key step in nuclear transplantation. 

To allow spindle polymerization and the determination of 
chromosome position while preventing progression out of mitosis, 
we transferred the mitotic zygotes into the proteasome inhibitor 
MG-132. The metaphase-to-anaphase transition requires degradation 
of the cyclin B subunit of the maturation promoting factor by 
the proteasome. In the presence of MG-132, the metaphase-to-anaphase 
transition and exit from mitosis are blocked. We found 
that short-term treatment of zygotes with MG-132 led to a reversible 
arrest in metaphase (Supplementary Tables 1 and 2) and spindle 
polymerization, allowing the observation of chromosome location 
by light microscopy or optical birefringence (Fig. 1e). 

To determine whether the chromosomes of these arrested zygotes 
could be reliably removed, we incubated them with Hoechst 33342 
and observed DNA content after spindle removal (Fig. 2a-d). When 
the spindle was removed by micromanipulation, we found that in all 
cases the chromosomes were also removed (n = 120/120; Fig. 2c, d). 


Zygotes in mitosis support nuclear reprogramming 


To determine whether zygotes ‘enucleated’ by removal of the spindle 
in mitosis had a greater capacity to support reprogramming than 
zygotes enucleated in interphase, we compared the results of genome 
transfer from embryonic donor cells into these two recipients. 
When pronuclei from interphase zygotes were replaced by micro- 
injection with pronuclei from another zygote, the resulting embryos 
developed to the blastocyst stage and into mice (Table 1, and 
Supplementary Fig. 1a). Nuclei from two-cell-stage embryos were 
also able to direct development to the blastocyst stage after transfer 

Figure 1 | The first embryonic cell cycle. a, Diagram of the first cell cycle. In 
the metaphase oocyte, condensed chromosomes (blue) are aligned on a 
metaphase plate, whereas nuclear factors that might be required for 
reprogramming or development (green) are dispersed throughout the 
cytoplasm. After fertilization, the parental genomes and required nuclear 
factors are sequestered in the pronuclei. After entry into mitosis and nuclear 
envelope break down (NEBD), the factors should be released into the 
cytoplasm. b, Oocyte arrested in metaphase of meiosis II, 15 h after injection 
with hCG. c, Zygote in interphase with two pronuclei, 28 h after injection 
with hCG. d, Zygote arrested in prometaphase of mitosis by nocodazole 
(noc), 30 h after injection with hCG. Noc, nocodazole. e, Zygote entering 
metaphase in the presence of MG-132, 30.5 h after injection with hCG. As in 
the unfertilized oocyte, a prominent microtubule spindle could be seen 
under either Hoffman modulation contrast (HMC) or optical birefringence, 
enabling chromosome removal. 

into interphase zygotes, but at a lower efficiency (Table 1). However, 
when interphase nuclei from eight-cell-stage blastomeres were used, 
development of the embryos was arrested in the first two cleavage 
divisions (Table 1, and Supplementary Fig. 1c). These observations 
agree with historical results and suggest that the cytoplasm of 
zygotes enucleated in interphase cannot routinely reprogramme 
the nuclei of cells that have advanced to the eight-cell stage. 

To test whether mitotic zygotes had an enhanced ability to repro- 
gramme more differentiated cells, we removed their chromosomes, 
microinjected mitotic chromosomes from nocodazole-arrested donor 
zygote, two-cell-stage or eight-cell-stage embryos and monitored 
development after removal of the drug. In contrast to nuclear transfer 

Figure 2 | Chromosome transfer into zygotes arrested in mitosis. 

a, Diagram outlining the method of chromosome transfer into zygotes 
arrested in mitosis. b, Arrested zygotes before spindle and chromosome 
removal, 10 min after the shift from nocodazole to MG-132. HMC, Hoffman 
modulation contrast. c, Removal of the zygote spindle and chromosomes by 
micromanipulation. d, A large group of zygotes after spindle removal, all 
lacking chromosomes. e, Piezo-actuated injection of a nocodazole-arrested 
ES cell into a mitotic zygote. f, ES cell chromosomes in the zygote 
immediately after the transfer. g, At 80 min after transfer, a new spindle has 
formed and the chromosomes have aligned in a new metaphase plate. MB, 
microtubule birefringence. h, i, Progression through the first mitosis after ES 
cell chromosome transfer. h, From left to right: prometaphase/metaphase, 
anaphase, and telophase/cytokinesis, shown under microtubule 
birefringence. Times after transfer are shown. i, Equal chromosome 
segregation into the two daughter blastomeres at 120 min after transfer. 


©2007 Nature Publishing Group 


NATURE|Vol 447|7 June 2007 


ARTICLES 


Table 1 | Developmental potency of zygotes reconstructed with genomes of different developmental and cell cycle stages 

Recipient       Donor         Method  No.          No. cleaved    Morulae,  Blastocysts,  Morulae and     No. of embryos  No. of      Pups
(cell cycle     (cell cycle           manipulated  (% of          day 3.5   day 3.5       blastocysts     transferred     pregnant
stage)          stage)                             manipulated)                           (% of cleaved)  (recipients)    recipients

Zyg. (M)*       Zyg. (M)      Inj.    46           21 (46)        0         17†           81
Zyg. (M)‡       Zyg. (M)      Inj.    93           66 (71)        36        17            80              48 (4)          4           16
Zyg. (I)‡       Zyg. (I)      Elec.   20           17 (85)        2         12            82              5 (1)           1           1
Zyg. (M)        2-cell (M)    Inj.    90           70 (78)        19        51            100             65 (5)          4           12
Zyg. (I)        2-cell (I)    Inj.    42           30 (70)        10        3             43              8 (1)           0           0
Zyg. (I)        2-cell (I)    Elec.   13           12 (92)        3†        3†            50
Zyg. (M)        8-cell (M)    Inj.    30           13 (43)        2         7             69              9 (2)           2           2
Zyg. (I)        8-cell (I)    Inj.    30           16 (53)        0         0             0
Zyg. (M)        ESC (M)       Inj.    1,093        323 (30)       92        109           62              174 (11)        4           9
Zyg. (I)        ESC (I)       Inj.    47           16 (34)        0         0             0
Zyg. (I)        ESC (M)       Inj.    55           35 (64)        0         0             0
Ooc. (MII)      ESC (I)       Inj.    275          212§                                   34              73 (9)                      9
Zyg. (M)        Fib. (M)      Inj.    775          231 (30)       72†       26†           42
♂♀♀ Zyg. (M)    ESC (M)       Inj.    23           15 (65)        6         8             93
♀♂♂ Zyg. (M)    ESC (M)       Inj.    75           27 (37)        7         11            66

Zyg., zygote; ♀♂♂, dispermic; ♂♀♀, digynic; ESC, embryonic stem cell; Ooc., oocyte; Fib., somatic cell fibroblast; M, mitosis; MII, metaphase of meiosis II; I, interphase; Elec., electrofusion; Inj., direct injection. 
* The genome was transferred back into the same zygote. 
† The numbers of morulae and blastocysts were scored on day 4. The table includes results from all, even initial, experiments. 
‡ The genome was exchanged between different zygotes. 
§ The number of embryos with interphase nuclei was taken as 100% for the calculation of the percentage that reached the morula and blastocyst stages. 


in interphase, chromosome transfer in mitosis resulted in efficient 
development to the blastocyst stage in vitro and to adulthood after 
embryo transfer for all three of the donor cell types, including cells 
from the eight-cell embryo (Table 1, and Supplementary Fig. 1e-g). 


Cloned mice generated by zygote chromosome transfer 


To understand chromosome dynamics after chromosome transfer 
and to try producing cloned mice from cultured cell lines, we 
performed experiments using ES cells as chromosome donors. To 
distinguish between zygote and donor cell chromosomes, we used 
donor ES cells that expressed a doxycycline (dox)-inducible histone 
H2B-cherry, a red fluorescent fusion protein (Fig. 3b, and Supple- 
mentary Fig. 2). 

Donor cells were arrested in mitosis with nocodazole (Fig. 2e, f) 
until microinjection, and consequently their chromosomes were 
transferred into the mitotic zygotes without a spindle (Fig. 3a). 


Figure 3 | Developmental potential in vitro and in vivo after chromosome 
transfer from ES cells into mitotic zygotes. a, Experimental outline. b, H2B- 
cherry donor ES cells with and without induction of the transgene by 
doxycycline (dox). c, d, Development of preimplantation embryos after 
zygote chromosome transfer, from the two-cell stage (c, left) to the blastocyst 
stage (d). Times after chromosome transfer are shown. e, Developmental 
arrest at the two-cell stage after the transfer of an ES cell genome into an 
enucleated interphase zygote. f, Cdx2 expression in trophectoderm cells but 
not in inner cell mass cells (arrow) of a blastocyst-stage embryo after 
chromosome transfer. g, Cloned pups produced by chromosome transfer, 
and control pups after caesarean section. Their placentas are also shown. Note 
the large placentas of clones but not the controls, and the dark bluish skin 
colour of clones as a result of respiratory failure. h, Primary culture of skin 
cells from a cloned pup. i, Genotyping by PCR of a cloned pup derived from 
the transgenic donor cells. Size markers are shown at left and right. 

Puro, puromycin; neo, neomycin; IL-2, interleukin-2; WT, wild type. j, Bar 
diagram showing weights (means ± s.d.) of pups (right) and their placentas 
(left) produced by chromosome transfer from ES cells, mitotic spindles of 
either another zygote or of a two-cell-stage blastomere, and of controls. 
Numbers below the bars are numbers of individuals. N, nocodazole treatment 
28-36 h after injection with hCG; M, treatment with 1 µM MG-132 
36.0-36.5 h after injection with hCG. IVC, in vitro cultured. 

After release from MG-132, a new spindle rapidly nucleated around 
the ES cell chromosomes. Chromosome segregation and cytokinesis 
were often observed within 90-150 min (Fig. 2g-i). Manipulated 
embryos cleaved at a frequency of 30%, and 60% of those that cleaved 
developed to the morula and blastocyst stages (Table 1 and Fig. 3c, d). 
This efficiency was comparable to that previously observed in nuclear 
transfer experiments using interphase ES cell donor nuclei and unfertilized 
oocytes. In contrast, when zygotes enucleated during interphase 
were used as recipients for either interphase ES cell donor 
nuclei or mitotic ES cell chromosomes, the resulting embryos 
arrested at the one-cell or two-cell stage (Table 1, Fig. 3e, and Sup- 
plementary Fig. 1d). 
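
The efficiencies quoted for ES cell chromosome transfer into mitotic zygotes can be recomputed from the counts in Table 1. The short Python sketch below does this arithmetic; the helper name `percent` and the variable names are illustrative, and the counts are taken from the Zyg. (M) recipient, ESC (M) donor row.

```python
def percent(part, whole):
    """Return part / whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

# Counts from Table 1 for mitotic zygote recipients and mitotic ES cell donors.
manipulated = 1093
cleaved = 323
morulae = 92
blastocysts = 109
transferred = 174
pups = 9

# Share of manipulated embryos that cleaved.
cleavage_rate = percent(cleaved, manipulated)
# Share of cleaved embryos that reached the morula or blastocyst stage.
morula_blastocyst_rate = percent(morulae + blastocysts, cleaved)
# Share of transferred embryos recovered as pups.
pup_rate = percent(pups, transferred)

print(cleavage_rate, morula_blastocyst_rate, pup_rate)
```

The second figure is the value tabulated as a percentage of cleaved embryos, which the text rounds to 60%.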

Blastocysts derived by ES cell chromosome transfer into mitotic 
zygotes expressed H2B-cherry, demonstrating their donor cell origin 
(Fig. 3d). These blastocysts also recapitulated the normal expression 
pattern of the Cdx2 protein, which is expressed in the trophectoderm 
but not in ES cells, confirming that reprogramming of gene 
expression had accompanied preimplantation development (Fig. 3f). 

We assessed the developmental potential of these ES-cell-derived 
blastocysts in vivo by embryo transfer to pseudopregnant recipients 
(Table 1 and Fig. 3). From a total of 174 transferred morulae and 
blastocysts, 9 living pups were recovered after caesarean section at 
embryonic day 19.5 (Fig. 3g). Seven of the nine pups failed to respire 
normally and died. Two pups established regular respiration, but one 
was humanely killed because of a midline closure defect and the other 
was rejected by its foster mother. We cultured skin cells from these 
animals and either assessed red fluorescence after induction with 
doxycycline or genotyped the cells by means of the polymerase chain 
reaction (PCR). All nine pups were transgenic (Fig. 3h, i), dem- 
onstrating that they were cloned animals. 

In a manner reminiscent of the overgrowth phenotype observed in 
other cloned animals, these cloned newborns had placentas that 
were markedly larger than those of controls (Fig. 3j). In contrast, 
mice derived by chromosome transfer from zygote, two-cell-stage 
and eight-cell-stage embryo donor cells had placental weights within 
the normal range. These blastomere-derived animals also regularly 
established normal respiration and survived to adulthood (Fig. 3j, 
and Supplementary Fig. 1). 


Zygotes can reprogramme adult somatic chromosomes 


The experiments described so far have demonstrated that chromo- 
somes can be successfully transferred into zygotes arrested in mitosis, 
allowing the derivation of cloned mice. However, if zygotes are to be 
useful recipient cytoplasts for producing genetically tailored human 
ES cell lines, they must be capable of reprogramming the genome of 
an adult somatic cell. We therefore performed chromosome transfer 
with mitotically arrested, adult tail-tip cells that carried a green fluorescent 
protein (GFP) transgene under the control of the Oct3/4 promoter 
(Oct4::GFP). Oct3/4 is expressed in early pluripotent cells but 
not in somatic tissues and therefore the reactivation of Oct3/4 is a 
measure of successful reprogramming. Embryos into which chromosomes 
from somatic cells had been transferred cleaved at an efficiency 
(30%) similar to that observed with ES cell chromosome 
donors but developed to the morula and blastocyst stages at a lower 
efficiency (42% of cleaved embryos; Table 1). Consistent with suc- 
cessful nuclear reprogramming, we observed that Oct4::GFP express- 
ion became visible in late-cleavage-stage embryos and was strongly 
expressed at the blastocyst stage (Supplementary Fig. 3). 


ES cells derived by chromosome transfer into zygotes 


We next sought to determine whether fertilized zygotes could be used 
in chromosome transfer experiments to produce ES cell lines from 
both embryonic (see Supplementary Fig. 4 for embryonic donor cell 
results) and somatic (Fig. 4) donor cells. As somatic donor cells we 
chose nocodazole-arrested skin fibroblasts derived from mice carrying 
the inducible H2B-cherry transgene (Fig. 4a, b). 

After transfer of skin fibroblast chromosomes into mitotic zygotes, 
embryos were allowed to develop to the blastocyst stage and were 
then used for ES cell derivation (Table 1) (Fig. 4c). We transferred 21 
blastocysts into culture for ES cell derivation; 14 of these attached to 
the mouse embryonic fibroblast feeder layer and 8 developed out- 
growths of the inner cell mass (Fig. 4d). Six of the outgrowths gave 
rise to cell lines that grew in phase-bright colonies that fluoresced red 
when exposed to doxycycline (Fig. 4e). 

These cell lines were immunoreactive to antibodies specific for the 
Oct3/4 protein and the stage-specific embryonic antigen-1, which are 
expressed in ES cells but not in skin fibroblasts (Fig. 4f, g). Karyotyping 
revealed the presence of a normal 40XY mouse karyotype for five of 
the cell lines (Fig. 4h). We injected cells from one of these cell lines into 
non-agouti blastocysts and transferred the blastocysts to recipient 
mice whose water was supplemented with doxycycline. At embryonic 
day 11.5 we recovered ten embryos and found that five of them possessed 
a high degree of chimaerism as judged by red fluorescence 
(Fig. 4i). When allowed to develop to term, postnatal chimaeric mice 
also displayed a high degree of agouti coat-colour chimaerism, sug- 
gesting that they were largely derived from the injected cells (Fig. 4j, 
and Supplementary Table 4). To test for germline chimaerism, we 
crossed a high-contribution chimaeric male to a non-agouti female 


Figure 4 | Derivation of ES cell lines from somatic-cell chromosome 
transfer blastocysts. a, Diagram of zygote somatic-cell chromosome 
transfer (SCCT) and derivation of SCCT ES cells. b, Somatic donor cell line 
with fibroblast morphology. BF, bright field. c, Zygote SCCT blastocyst, 72 h 
after transfer. d, Outgrowth of zygote SCCT inner cell mass 16 days after 
plating. e, Zygote SCCT ES cell culture. f, g, Expression of the pluripotency 
marker gene Oct4 (g) and the embryonic antigen SSEA-1 (f) in zygote SCCT 
ES cells, detected by immunostaining. h, Mitotic chromosomal spread of a 
zygote SCCT ES cell line with a normal set of 40 mouse chromosomes. 

i, Chimaera analysis after injection of zygote SCCT ES cells into blastocysts. 
An embryonic chimaera (right) and a non-transgenic sibling (left). j, A male 
chimaera (right) and its germline offspring (left). 

and observed that 5 of 11 pups displayed agouti coat colour, demonstrating 
that these cells had contributed to the germ line and were 
genuine ES cells (Fig. 4j). 


Discussion and relevance 


Our observation that meiotic oocytes and mitotic zygotes can repro- 
gramme somatic genomes, whereas zygotes enucleated in interphase 
cannot, suggests that the ability of the embryonic cytoplasm to sup- 
port reprogramming fluctuates with the cell cycle. It is possible that 
some reprogramming factors are destroyed and renewed with each 
cell cycle, but it seems more likely that one or more factors critical for 
embryonic development or reprogramming localize to the pronuclei 
during interphase (Fig. 1a). Thus, removal of the pronuclei during 
enucleation removes the relevant factor or factors. In contrast, 
during meiosis and mitosis, many nuclear factors become dispersed 
throughout the cytoplasm, and the condensed chromosomes can be 
removed from the oocyte or zygote without depleting the factors’ 
activity. Consistent with our model is the recent demonstration that 
if the interphase pronuclei of the zygote are punctured before 
removal of the chromatin, reprogramming activity in the interphase 
zygote can be stimulated. In addition, when the germinal vesicles of 
immature oocytes are removed before nuclear envelope breakdown 
and meiotic metaphase arrest, the oocytes become unsuitable as recipients 
for nuclear transfer. Reprogramming factors in ES cells 
may also reside in the nucleus, because enucleated cytoplasts generated 
from ES cells have so far failed to reprogramme somatic cells. 

The cloning of animals has relied almost exclusively on unfertilized 
oocytes and therefore has required artificial activation procedures 
that poorly mimic fertilization. As a result it has been difficult to 
determine whether non-physiological oocyte activation contributes 
to some abnormalities observed in cloned animals. Experiments 
transferring nuclei into fertilized cow oocytes at telophase of meiosis 
suggest that artificial activation may indeed contribute to poor clone 
development. In addition, when the pronuclei of one-cell-stage 
embryos generated by oocyte nuclear transfer and artificial activation 

Figure 5 | Aneuploid zygotes with more than two pronuclei can be used as 
recipients for chromosome transfer. a, Diagram of an in vitro fertilization 
(IVF) reaction without a zona pellucida. The increased access to sperm 
results in a high frequency of polyspermic zygotes. b, Dispermic zygote in 
interphase, with two paternal pronuclei and a single maternal pronucleus 
(arrows); 6 h after IVF. c, Dispermic zygote at prometaphase, after 
pronuclear envelope breakdown and chromosome condensation with three 
groups of haploid genomes (arrowheads); 20 h after IVF. d, Removal of the 
triploid mitotic genome, 10 min after treatment with nocodazole. 

e-g, Clones at the four-cell (e), morula (f) and blastocyst (g) stages derived 
after chromosome transfer of an ES cell genome into a polyspermic zygote. 

are serially transferred into an enucleated fertilized zygote, development 
is improved, again suggesting that normal fertilization 
improves the ability of the zygote cytoplasm to support the development 
of NT embryos. 

Chromosome transfer into mitotic zygotes bypasses the need for 
artificial activation, because the sperm has already initiated develop- 
ment. We found that mice cloned from ES cells by chromosome trans- 
fer into zygotes displayed at least two phenotypes commonly observed 
in other cloned animals: neonatal respiratory failure and placental 
overgrowth. These phenotypes, common to many cloned animals, 
therefore do not result solely from artificial activation of oocytes. In 
addition, neonatal mice derived by transferring chromosomes from 
blastomeres into zygotes did not display these phenotypes, suggesting 
that they do not arise solely from this mechanical procedure but prob- 
ably result from failures in nuclear reprogramming. 

Our experiments show that reprogramming activities are not per- 
manently lost from the egg cytoplasm after fertilization, which is 
relevant to the continuing efforts to produce human ES cell lines 
by nuclear transfer from somatic cells. The most readily available 
human oocytes are aged ones that failed to be fertilized during in 
vitro fertilization reactions. Mouse oocytes aged in this way have 
decreased developmental competence and their human counterparts 
have so far been unsuitable recipients for nuclear transfer. 
Although fresh unfertilized human oocytes would be preferable, 
there are substantial logistical, medical and societal difficulties in 
obtaining sufficient numbers. In contrast to fresh unfertilized 
oocytes, which generally do not exist in excess of clinical need, nor- 
mal fertilized zygotes are frozen with some regularity. In many cases 
these frozen zygotes are discarded by couples who have completed 
their assisted reproduction treatment and they could instead be 
donated for stem cell research. 

More significantly, 3-5% of all human zygotes are found to contain 
an abnormal number of pronuclei after in vitro fertilization. 
We estimate that these aneuploid zygotes number in the tens of 
thousands each year in the United States. These zygotes are 

Times after transfer are shown. h, Diagram of fertilization with a failure to 
extrude the second polar body. Cyto B, cytochalasin B. i, Digynic zygote in 
interphase, with three pronuclei (arrows); 10 h after IVF. j, Nocodazole-arrested 
digynic zygote at prometaphase, with three haploid genomes 
(arrowheads); 20 h after IVF. k, Mitotic chromosomes of a digynic zygote 
assembling in a single spindle, 10 min after treatment with nocodazole. 

l, Clone at the blastocyst stage derived by chromosome transfer of an H2B- 
cherry transgenic ES cell genome into a digynic zygote; 76 h after transfer. 
m, H2B-cherry transgenic ES cell line derived after chromosome transfer 
into a triploid digynic zygote. n, A mitotic chromosomal spread of the same 
cell line, showing a diploid karyotype. 

excluded from clinical use at the one-cell stage because their abnormal 
ploidy is incompatible with normal postimplantation development; 
they, too, could be donated for research without interfering 
with that couple’s reproductive efforts. Human polyspermic zygotes 
routinely undergo cleavage division and might therefore be 
arrested in mitosis and the zygotic chromosomes removed by using 
the methods we have described. After introducing the correct num- 
ber of human somatic chromosomes from a mitotic donor cell, our 
results suggest that reconstructed zygotes might support develop- 
ment to the blastocyst stage and human ES cell derivation. 

We have tested whether aneuploid mouse zygotes resulting from 
polyspermy and failed polar body extrusion can support nuclear 
reprogramming by inducing these conditions in vitro and using the 
resulting zygotes as recipients for mitotic chromosome transfer 
(Fig. 5). We found in both cases that, on entry into mitosis, the 
pronuclei broke down and the supernumerary chromosome sets 
congregated on a single spindle in the centre of the zygote. When 
this spindle was removed, the chromosomes were also removed 
(Fig. 5d). After transfer of ES cell chromosomes from mitotic donor 
cells into these cytoplasts they developed to the blastocyst stage 
(Table 1) and could be used to derive ES cell lines with a normal 
karyotype (Fig. 5n). 

Thus, our results provide several previously unexplored and technically 
feasible avenues towards the production of ‘genetically tailored’ 
human ES cell lines that are not constrained by the limitations 
of oocyte donation for research. 

The finding that oocytes, zygotes and ES cells each harbour repro- 
gramming activities suggests that a continuum of activity may extend 
from the unfertilized egg through cells of the preimplantation 
embryo to the ES cells derived from them. Cells from one-cell, 
two-cell, four-cell or eight-cell embryos might therefore all be useful 
recipients for the chromosome transfer methods we have described. 
Similarly, it may be possible to generate cytoplasts from ES cells that 
have been arrested in mitosis and test whether they can support 
nuclear reprogramming. If cells from discarded cleavage-stage 
human preimplantation embryos could be used as recipients for 
chromosome transfer, or if ES cells arrested in mitosis could be used 
to generate cytoplasts with reprogramming activity, it would sub- 
stantially advance efforts to produce human ES cell lines for disease 
modelling and transplantation medicine. 


METHODS SUMMARY 

Chromosome transfer into mouse zygotes in mitosis. Zygotes were retrieved 
from mated females 20-24 h after injection with human chorionic gonadotropin 
(hCG) and placed into KSOM embryo culture medium with 0.1 µg ml⁻¹ nocodazole 
25-29 h after injection with hCG. Small groups of zygotes arrested in 
mitosis were washed and transferred into KSOM with 2 µM MG-132 for about 
10 min. Manipulations (Supplementary Movie 2) were conducted on a heated 
stage in the presence of 5 µg ml⁻¹ cytochalasin B, within 10-20 min while zygotes 
were still in prometaphase of mitosis. Mitotic donor cells were obtained by 
mitotic shake-off from a cell culture grown for 6 h in the presence of 0.1 µg ml⁻¹ 
nocodazole. Cells were treated with trypsin and then mixed with 1% polyvinylpyrrolidone 
containing 0.1 µg ml⁻¹ nocodazole. A detailed description of materials 
and methods is provided in Supplementary Information. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

Mice and cell lines. BDF1 mice used as zygote donors were obtained from
Charles River Laboratories, and B6;CBA-Tg(Pou5f1-EGFP)2Mnn/J mice originating
from the lab of Hans Schoeler*' were obtained from Jackson Laboratories.
KH2 cells as well as plasmids pBS31tetOpgkATGfrt and pCAGGS-Flpe-pur for
the construction of KH2-H2B-cherry ES cells (approx. passage 40) were
obtained from Konrad Hochedlinger?. The H2B-cherry fusion protein was
constructed by recombinogenic PCR with the primers 5'-ACA CCA GCG
CTA AGG ATC CAC CGG TCG CCA TGG TGA GCA AGG GCG AG-3' and
5'-AGC CTT TAA GCC TGC CCA GAA GAC-3', and 5'-GCG ACC GGT GGA
TCC TTA GCG CTG GTG TAC TTG GTG ACG GCC TTA GTA CC-3' and
5'-CAC CGT CGA CGG TAC CGC CAC CA-3'; it was then cloned into the EcoRI
site of pBS31tetOpgkATGfrt. Somatic donor cells from skin and tail were cultured
as described previously’. To induce the tetO-H2B-cherry transgene, doxycycline
(catalogue no. D9891; Sigma) was added to the drinking water of mice at
a concentration of 1 mg ml⁻¹ and to cell cultures at 1 µg ml⁻¹.

Chromosome transfer into zygotes arrested in mitosis. BDF1 females 6-8
weeks old were superovulated as described previously’' and mated to BDF1
males. Zygotes were retrieved from the oviducts of plugged females 20-24 h after
injection with hCG, and placed into KSOM medium. Embryos were transferred
into KSOM containing 0.1 µg ml⁻¹ nocodazole (catalogue no. M1404; Sigma)
25-29 h after injection with hCG. Zygotes arrested in mitosis were washed
through one to three drops of KSOM to remove residual nocodazole and then
transferred into KSOM (Chemicon) with 1-2 µM MG-132 for 5-20 min and
then, for manipulations, transferred on the stage into oil-covered droplets of
HCZB supplemented with 5 µg ml⁻¹ cytochalasin B (catalogue no. C6762;
Sigma) and 1-2 µM MG-132 (catalogue no. 474790; Calbiochem). In some cases,
MG-132 was completely omitted and replaced with 0.0025-0.03 µg ml⁻¹ nocodazole
during the manipulations to delay mitotic progression of the zygote while
not dissociating the spindle (Supplementary Fig. 5). This modification resulted
in a smaller spindle volume and a more regular cleavage of clones. Mitotic
donor cells were obtained after culturing cells with 0.1 µg ml⁻¹ nocodazole for
6-12 h. Cells were obtained by mitotic shake-off from the culture dish, treated
with trypsin and then mixed with 1-7% polyvinylpyrrolidone containing
0.1 µg ml⁻¹ nocodazole. Mitotic cells were selected under the microscope and
then transferred with a 10-µm needle for mitotic ES cells and with a 12-14-µm
needle (Humagen) for mitotic somatic cells. Removal of the spindle-chromosome
complex and transfer of broken cells were done in one step, taking care
that the cell was deposited approximately in the middle of the zygote
(Supplementary Movie 2). The average survival rate that we obtained was 85%.

All manipulations were done on a heated stage (37 °C) of a Nikon microscope 
equipped with Hoffman modulation contrast optics; they were completed 
within a 5-45-min window after the release from nocodazole. The microscope
was equipped with a Xyclone laser (Hamilton Thorne Biosciences) for zona 
drilling and a piezo unit (PrimeTech) for breaking the zygotic plasma mem- 
brane. After manipulations had been completed, chromosome transfer embryos 
and control embryos were cultured in droplets of KSOM covered with mineral 
oil. 

Mitotic spindles of zygotes and blastomeres were transferred with needles 13 
or 14 µm in diameter with the method as described for ES cells. Manipulation of
interphase zygotes was performed 24-26 h after injection with hCG with the
methods described above, with the addition of 0.1-0.3 µg ml⁻¹ nocodazole to
the manipulation medium and the use of a 14-µm needle for enucleation and
transfer. Nuclei of two-cell-stage embryos were transferred by direct injection or
by electrofusion with an LF101 electrofusion apparatus with two direct-current
pulses of 1.8 kV cm⁻¹ in medium containing 0.26 mM mannitol, 0.1 mM
MgSO4, 0.5 mM HEPES and 0.05% bovine serum albumin. Images of optical
birefringence were taken with the Oosight system (Cambridge Research and 
Instrumentation). 

In vitro fertilization (IVF) was performed as described". In brief, oocytes were 
collected from hormone-stimulated females 14 h after injection with hCG.
Sperm was obtained from the cauda epididymis of a non-transgenic male and 
incubated at 37 °C for 30-60 min in HTF medium (IVF Online) supplemented 
with 5% fetal bovine serum before the addition of the oocytes. The presence of 
pronuclei was scored 10 h after the initiation of the IVF reaction, and zygotes
with more than two pronuclei were selected for chromosome transfer. Unlike in 
humans, in whom only the zona pellucida acts as a barrier to polyspermy, in the 
mouse both the oolemma and the plasma membrane inhibit polyspermy™. In 
vitro, the mouse zona pellucida acts as an efficient barrier to sperm even without 
any fertilization. To induce high levels of fertilization, we removed the zona 
pellucida either partly with a laser pulse or completely with acidic Tyrode’s 
solution. Dispermic and digynic zygotes normally entered the first mitosis in 
the presence of nocodazole, and the removal of nocodazole (10 min) resulted in 
the assembly of all chromosomes in a single spindle that could be removed as in a
normally fertilized zygote (Fig. 5d, k). Also in humans, chromosomes of dispermic
embryos assemble in a single spindle in the centre of the egg**. To induce
digynic zygotes experimentally in the mouse, cytochalasin B was added to the
IVF reaction at a concentration of 3 µg ml⁻¹ for 5 h. In humans, a failure to
extrude the second polar body occurs at a frequency of about 4% after fertilization
by intracytoplasmic sperm injection”.
ES cell derivation, embryo transfer and genotyping. For the derivation of 
mouse ES (mES) cells, cloned blastocysts were plated on irradiated mouse embry- 
onic fibroblast feeder layers in mES medium containing the mitogen-activated 
protein kinase inhibitor PD98059 (Cell Signalling) and LIF (Chemicon). Mitotic 
spreads of ES cells were made by incubating ES cells for 12 h in 0.1 µg ml⁻¹
nocodazole to arrest them in mitosis; they were then treated with trypsin and 
incubated in 0.56% w/v KCl, stained with Hoechst and then fixed with a 1:3 
mixture of glacial acetic acid and methanol. Chimeric mice were made by injec- 
tion of ES cells into non-agouti BDF2 blastocysts and embryo transfer into day 2.5 
pseudopregnant albino ICR females. Cloned and control blastocysts were trans- 
ferred to the uterus of day 2.5 pseudopregnant ICR females. Caesarian section was 
performed on embryonic day 19.5. Surviving pups were fostered to an ICR foster 
mother that had given birth either on the same day or 1-4 days earlier. Primers for
genotyping of puromycin, neomycin, interleukin-2 and T-cell antigen receptor 
genes are as described by the Jackson Laboratory (http://www.jax.org/). 
Experiments with animals were performed in accordance with the guidelines 
established by the Harvard University/Faculty of Arts and Sciences IACUC for 
the humane care and use of animals in research. 
Immunostaining. Preimplantation stage embryos were stained with a Cdx2 
antibody (Biogenex) as described*’. The secondary antibody was coupled to 
rhodamine-X. 

ES cells (Fig. 5) and somatic donor cells were stained with antibodies specific 
for OCT4 (sc5279; Santa Cruz) and SSEA-1 (sc21702; Santa Cruz). 

DNA repair is limiting for haematopoietic
stem cells during ageing

Anastasia Nijnik, Lisa Woodbine, Caterina Marchetti, Sara Dawson, Teresa Lambe, Cong Liu,
Neil P. Rodrigues, Tanya L. Crockford, Erik Cabuy, Alessandro Vindigni, Tariq Enver, John I. Bell,
Predrag Slijepcevic, Christopher C. Goodnow, Penelope A. Jeggo & Richard J. Cornall


Accumulation of DNA damage leading to adult stem cell exhaustion has been proposed to be a principal mechanism of 
ageing. Here we address this question by taking advantage of the highly specific role of DNA ligase IV in the repair of DNA 
double-strand breaks by non-homologous end-joining, and by the discovery of a unique mouse strain with a hypomorphic 
Lig4^Y288C mutation. The Lig4^Y288C mouse, identified by means of a mutagenesis screening programme, is a mouse model for
human LIG4 syndrome, showing immunodeficiency and growth retardation. Diminished DNA double-strand break repair in
the Lig4^Y288C strain causes a progressive loss of haematopoietic stem cells and bone marrow cellularity during ageing, and
severely impairs stem cell function in tissue culture and transplantation. The sensitivity of haematopoietic stem cells to 
non-homologous end-joining deficiency is therefore a key determinant of their ability to maintain themselves against 
physiological stress over time and to withstand culture and transplantation. 


Long-lived multicellular organisms depend on tissue replenishment 
from small pools of slowly dividing stem cells that must be self- 
renewed and maintained with a minimum of mutations throughout 
life’. This principle is best characterized for the haematopoietic sys- 
tem, which is maintained from small numbers of stem cells within the 
bone marrow. Haematopoietic failure follows the loss of haemato- 
poietic stem cells in humans and animals exposed to a threshold level 
of genotoxic agents such as ionizing radiation or cytotoxic drugs*”, 
and one of the most pathogenic forms of DNA damage in these situa- 
tions is a double-strand break*. Endogenous double-strand breaks 
arise predominantly from oxidative DNA damage caused by reactive 
oxygen species (ROS)° and are repaired by non-homologous end- 
joining (NHEJ)*’ and homologous recombination’. Selective induc- 
tion of senescence in haematopoietic stem cells by ionizing radiation’ 
and by elevated levels of ROS observed in ataxia-telangiectasia 
mutated-deficient (ATM⁻/⁻) mice® indicates that double-strand
break damage may limit haematopoietic stem cell function, and unre- 
paired double-strand breaks have been shown to accumulate in 
human and mouse tissues during ageing’. Furthermore, p53 and 
Rad50, which are involved in responses to genotoxic stress and repair, 
both affect the numbers and function of stem cells during ageing’®”’. 
However, the effects of physiological levels of double-strand break 
damage and the direct contribution of DNA repair pathways to hae- 
matopoietic stem cell maintenance in ageing are not known. 

Six components of NHEJ have been identified: Ku70, Ku80, DNA-PKcs,
XRCC4, DNA ligase IV (LigIV)°’ and XRCC4-like factor
(XLF)'*°. The viable Ku and DNA-PKcs knockout mice are the currently
established models of NHEJ deficiency’. However, some end-joining
can occur in the absence of Ku and DNA-PKcs"*, and these
proteins have NHEJ-independent functions, including telomere 
maintenance’”’*. In contrast, LigIV is essential and has no known 
functions outside DNA repair by NHEJ; however, because of lethality 


of the Lig4-null mice'”"* there are no viable animal models of LigIV 
deficiency. Instead, hypomorphic mutations in LIG4 result in the 
human LigIV syndrome, characterized by growth defects, immuno- 
deficiency and hitherto unexplained pancytopenia’**. The same 
pattern of human disease is also induced by mutations in XLF'*”*”. 


Mouse model of human LigIV syndrome

One of the principal advantages of mutagenesis induced by ethylni- 
trosourea (ENU) is its ability to reproduce under controlled condi- 
tions the most common type of natural human variation, by 
generating one round of single-nucleotide substitutions in a known 
genome sequence at a rate of about 1 per 10° bases”*?”. We therefore 
screened a library of C57BL/6 (B6) mice segregating ENU-induced 
substitutions for immunodeficiency phenotypes similar to LigIV 
syndrome by using flow cytometry, and identified a new strain, tiny, 
with reduced lymphocytes in peripheral blood (Fig. 1a) and growth 
retardation (Fig. 1b, c) inherited together as a recessive trait. The tiny 
mutation resulted in partial embryonic lethality on the inbred B6 
background, but homozygotes were born at about 40% of the 
expected frequency and grew to adulthood, although with small size. 
In a 550-animal F2 intercross, the mutation mapped to a 2.30-megabase
(Mb) region between 8.54 and 10.84 Mb on chromosome 8
(Fig. 1d, and Supplementary Table 1). Complementary DNAs of
the three known genes in the interval (Tnfsf13b, Lig4 and Efnb2) were
sequenced and a single 1067A→G substitution in Lig4 was identified
and confirmed in genomic DNA (Fig. 1e). The mutation encodes a
Y288C substitution in the catalytic domain (Fig. 1f). This Lig4^Y288C
mutant is the first viable model of LigIV deficiency.
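
The link between a single A→G substitution and a tyrosine-to-cysteine change can be checked against the standard genetic code. The sketch below is illustrative only and is not part of the original analysis: it assumes the standard codon table and, because the text does not say which tyrosine codon mouse Lig4 uses at residue 288, it enumerates every A→G change in both tyrosine codons. Codon 288 would span nucleotides 862-864 if positions were counted from the start codon, so position 1067 presumably counts from the start of the cDNA; that reading is an inference, not something the text states.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def a_to_g_variants(codon):
    """Yield (position, mutated codon) for every single A->G change in a codon."""
    for i, base in enumerate(codon):
        if base == "A":
            yield i, codon[:i] + "G" + codon[i + 1:]


# Both tyrosine codons are tried because the actual codon is not given in the text.
tyr_codons = [codon for codon, aa in CODON_TABLE.items() if aa == "Y"]
results = {
    (codon, pos): CODON_TABLE[mutant]
    for codon in tyr_codons
    for pos, mutant in a_to_g_variants(codon)
}

for (codon, pos), aa in results.items():
    print(f"{codon}: A->G at codon position {pos + 1} gives amino acid {aa}")
```

Under these assumptions, the only A in either tyrosine codon is the middle base, and changing it to G yields a cysteine codon in both cases, which is consistent with the reported Y288C substitution.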


Lig4^Y288C encodes a hypomorphic mutation

Examination of double-strand break repair in Lig4^Y288C mouse
embryonic fibroblasts (MEFs) by γ-H2AX foci analysis revealed a
similar repair defect to that in Lig4-null cells after 3-Gy γ-irradiation
(Fig. 2a, left panel). However, after 1-Gy γ-irradiation, residual repair
capacity was observed in Lig4^Y288C but not in Lig4-null MEFs, suggesting
that Y288C creates a hypomorphic mutation that markedly
impairs LigIV function (Fig. 2a, right panel). The repair defect was
fully corrected by wild-type Lig4 cDNA (Fig. 2b). Western blotting
and immunofluorescence showed that LigIV expression was reduced
about fivefold in Lig4^Y288C MEFs (Fig. 2c, and Supplementary Fig.
1a). Low LigIV expression was not due to reduced expression of
XRCC4, with which all LigIV is stably complexed** (Supplementary
Fig. 1b). After immunoprecipitation with anti-XRCC4 antibodies,
we observed a fivefold reduction in LigIV levels, but adenylation
activity was undetectable, showing in vivo activity reduced at least
tenfold (Fig. 2d-f). Recombinant human LigIV^Y288C interacted efficiently
with XRCC4, but adenylation and double-strand break ligation
activities of LigIV^Y288C-XRCC4 complexes were decreased about twofold
(Fig. 2g, h, and Supplementary Fig. 1d-f). Co-expression of
LigIV^Y288C and XRCC4 in rabbit reticulocyte lysates also showed efficient
LigIV^Y288C-XRCC4 interaction (Supplementary Fig. 1c). Thus,
the mutation affects the expression and catalytic function of LigIV,
reducing in vivo activity at least tenfold, which is similar to human
LigIV syndrome mutations’””.


Diminished NHEJ limits cellular proliferation 


The mutation severely impaired the proliferation of Lig4^Y288C MEFs
at both 20% O₂ and 3% O₂ (Fig. 3a). To investigate the basis of

Figure 1 | Identification of an ENU-induced missense substitution in LigIV
in the tiny mouse strain. a, Flow cytometry of blood, stained for B220 and
CD4, from B6 wild-type (WT) and Lig4^Y288C homozygous mice. b, c, Weight
(b) and size (c) of 8-week-old sex-matched Lig4^Y288C homozygous and wild-type
B6 littermates (means and 95% confidence intervals are shown in
b). d, Mapping the phenotype to chromosome 8. e, A1067G substitution in
Lig4 cDNA. f, Location of the mutation in the domain structure of LigIV.

impaired proliferation, we examined the cells for γ-H2AX and
p1981-ATM foci. We observed higher numbers of foci in Lig4^Y288C
MEFs; the foci increased with population doublings (Fig. 3b) and
were more frequent and occurred at earlier passages at 20% O₂ than
at 3% O₂, suggesting that they arise from oxidative damage. This was
not due to differences in ROS levels in the MEFs (Supplementary Fig.
2a). Reduced proliferation was also not attributable to telomere attrition,
because the rate of telomere shortening with passage number
was similar in wild-type and Lig4^Y288C MEFs (Fig. 3c), and the
γ-H2AX foci in Lig4^Y288C MEFs were not located at telomeres
(Supplementary Fig. 2b). We conclude that the reduced proliferation
capacity in Lig4^Y288C MEFs is due to the specific failure to repair
double-strand breaks, probably arising from oxidative DNA damage.
To assess whether such damage can arise in non-dividing cells, we
maintained passage 1 MEFs in the plateau phase for 4 weeks. Once
again, Lig4^Y288C MEFs accumulated increased γ-H2AX foci, and this
time with little protection afforded by lower O₂ tension (Fig. 3d).
These findings show that NHEJ is important in repairing double-strand
breaks that occur at physiological O₂ tension in both resting
and proliferating cells.


NHEJ maintains stem cells during ageing 

To examine the role of NHEJ in the maintenance of adult stem cells 
we focused on haematopoietic stem cells in bone marrow and enumerated
the well-defined population of multipotent KLS (c-Kit⁺,

Figure 2 | Impact of the Y288C mutation. a, γ-H2AX foci analysis in
plateau-phase wild-type (WT; white bars), Lig4^Y288C (grey bars) or
Lig4⁻/⁻p53⁻/⁻ (black bars) MEFs after γ-irradiation at 3 Gy (left) or 1 Gy
(right). b, Complementation of the double-strand break repair defect in
transformed Lig4^Y288C MEFs after transfection with Myc-tagged WT human
Lig4 cDNA; results are means and s.d. for three experiments.
c, Immunofluorescence of WT, Lig4^Y288C and Lig4⁻/⁻p53⁻/⁻ MEFs showing
reduced LigIV in Lig4^Y288C but normal levels of XRCC4. d, Lig4 western blot
after co-immunoprecipitation with anti-XRCC4 antibodies from 500 ng of
whole cell extract of WT, Lig4^Y288C heterozygote (LigIV"“) and homozygote
(LigIV^Y288C) MEFs. e, Adenylation activity of the same samples as in d, with
300 ng of recombinant LigIV-XRCC4 complex (LX) as control. f, Serial
dilution of extracts from WT MEFs shows the limit of detectable adenylation
activity to be 50 ng of extract. g, h, Adenylation of insect-cell-expressed
human WT and Y288C-LX complexes pretreated with PPi (g), and double-strand
break ligation of a 442-base-pair substrate by insect-cell-expressed
human WT and Y288C-LX complexes (h) (see also Supplementary Fig. 1f).




©2007 Nature Publishing Group 




of KLS cells was within normal range at 5-12 weeks, but was
decreased by 20-26 weeks compared with age-matched Lig4⁺/⁺
(P < 0.001) and 5-12-week Lig4^Y288C (P < 0.01; Fig. 4a, c) mice.
This contrasts with the expansion of the KLS population with age
in wild-type B6 mice*’. The loss of the KLS cells correlated with a
decline in bone marrow cellularity (P < 0.01) and erythrocyte precursors
(P < 0.05) in the Lig4^Y288C mice between 5-12 weeks of age
and 20-26 weeks, whereas the marrow cell count was unchanged
between 5 and 30 weeks of life in wild-type controls (Fig. 4c).

The KLS population includes CD34⁻Flt3⁻ long-term haematopoietic
stem cells, which give rise to CD34⁺Flt3⁻ short-term stem
cells and then CD34⁺Flt3⁺ multipotent progenitors (Fig. 4a). At

Figure 3 | Double-strand breaks accumulate in Lig4^Y288C embryonic
fibroblasts independently of replication and confer impaired proliferation.
a, Cumulative population doublings in wild-type (WT; squares and inverted
triangles) and Lig4^Y288C homozygous (upright triangles and diamonds)
MEFs growing at 20% O₂ (inverted triangles and diamonds) or 3% O₂
(squares and upright triangles). b, γ-H2AX and p1981-ATM foci in the same
systems as a; results are means and s.d. for three experiments. c, Telomere
length estimated by Flow-FISH in WT and Lig4^Y288C MEFs at passages 2
(white bars), 5 (grey bars) and 8 (black bars) at the indicated oxygen
tensions. d, γ-H2AX and p1981-ATM foci in non-dividing WT and Lig4^Y288C
MEFs at 20% or 3% O₂ (mean and s.d. of three experiments).

20-26 weeks, the absolute number of cells was reduced in all KLS
subcompartments in Lig4^Y288C bone marrow, but CD34⁻Flt3⁻ long-term
haematopoietic stem cells were preserved as a percentage of the
KLS compartment, and cell loss was most marked in the more rapidly
dividing CD34⁺Flt3⁺ multipotent progenitors (Fig. 4b). The relative
preservation of long-term haematopoietic stem cells and diminution
of multipotent progenitors was also seen in young (7-12-week)
Lig4^Y288C mice (Supplementary Fig. 3a) and resembles the changes
in the KLS subpopulations observed during normal ageing”.


Mechanism of stem cell failure 


We next analysed the basis for the failure to maintain haematopoietic 
stem cell numbers with age in Lig4^Y288C mice, by using a range of
functional assays. Co-transplantation experiments into irradiated
recipients with CD45 allotype-marked bone marrow excluded the
possibility that the haematopoietic stem cell defects in Lig4^Y288C
mice were secondary to abnormalities in bone marrow stroma or
macrophages, instead showing that Lig4^Y288C marrow had a cell-autonomous
deficiency in long-term haematopoietic stem cell function
(Fig. 5a). The intrinsic defect in Lig4^Y288C bone marrow was not
due simply to poor relative expansion of lineage-restricted progeny,
because it affected the long-term reconstitution of KLS cells themselves
and that of downstream pro-B and granulocyte populations
measured at 16 weeks in mice that had been injected with equal
numbers of wild-type and Lig4^Y288C donor KLS cells (Fig. 5b). In
tissue culture, 18-26-week Lig4^Y288C bone marrow stem cells also
showed greatly impaired performance in the cobblestone-area-forming
cell (CAFC) assay (Fig. 5c).

To gauge the competitiveness of Lig4^Y288C stem cells in situ without
removal from their physiological niche or exposure to high O₂
tension in tissue culture, we transferred 5 × 10° CD45.1-marked
wild-type B6 bone marrow cells into non-irradiated 10-14-week
B6 wild-type and Lig4^Y288C recipients. We also transferred CD45.2
wild-type bone marrow into (CD45.1 × CD45.2)F1 B6 mice to
control for immunological rejection against CD45 (ref. 33).
Remarkably, transplanted wild-type bone marrow stem cells made
a long-term multi-lineage contribution to haematopoiesis in
Lig4^Y288C recipients such that, by 14-17 weeks after transplantation,
81 ± 17% of KLS cells in the bone marrow and 92 ± 2%
(mean ± s.d.) of granulocytes in the blood were donor derived
(Fig. 5d). By contrast, little if any measurable stem cell engraftment
occurred in wild-type recipients in which the stem cell niche was
populated with normal self-renewing haematopoietic stem cells as
expected. The competitive replacement of most host Lig4^Y288C haematopoietic
stem cells by a small number of donor wild-type haematopoietic
stem cells indicates that there is a much greater defect in
stem cell function than is reflected in steady-state long-term haematopoietic
stem cell numbers, the latter possibly being compensated
for by increased proliferation. Examination of the rate of turnover of
the haematopoietic stem cell populations in vivo by the incorporation
of bromodeoxyuridine (BrdU) revealed that twice as many KLS
and long-term haematopoietic stem cells proliferated during a 40-h
period in Lig4^Y288C compared with the wild type (Fig. 5e). Taken
together with the diminished KLS population with age and the pancytopenia
noted in human LigIV syndrome”, these results show that
reduced NHEJ impairs the self-renewal function of long-term adult
haematopoietic stem cells.

Figure 4 | Lig4^Y288C impairs the maintenance of

The decline in number and marked functional impairment of haematopoietic
stem cells in Lig4^Y288C suggests that, as in the Lig4^Y288C
embryonic fibroblasts, double-strand breaks arise in haematopoietic
stem cells under normal O₂ tensions. We observed increased double-strand
breaks in bone marrow from rag1⁻/⁻Lig4^Y288C mice, showing
the impact of the repair defect in vivo (see Methods). We observed
a similar increase in double-strand breaks in flow-sorted KLS cells
from 18-week-old mice (9 out of 572 (1.57%) in Lig4^Y288C versus 2
out of 1,917 (0.10%) in wild-type; χ² = 21.61, P < 0.0001). The low
frequency of affected cells may be explained by the effective clearance
of dying cells in bone marrow or their reduced proliferation potential.
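
The reported test statistic for the KLS cell counts can be reproduced from the two proportions given in the text. The following is a minimal sketch, assuming that the comparison was a Pearson chi-squared test on a 2 × 2 table without continuity correction; the text does not name the exact test variant, so that choice is an assumption. The function names are illustrative.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Counts from the text: cells with double-strand breaks out of all KLS cells scored.
mut_pos, mut_total = 9, 572    # mutant KLS cells
wt_pos, wt_total = 2, 1917     # wild-type KLS cells

chi2 = pearson_chi2_2x2(mut_pos, mut_total - mut_pos, wt_pos, wt_total - wt_pos)
p_value = chi2_sf_df1(chi2)

print(f"mutant: {100 * mut_pos / mut_total:.2f}%  wild-type: {100 * wt_pos / wt_total:.2f}%")
print(f"chi-squared = {chi2:.2f}, P = {p_value:.2e}")
```

Under that assumption the statistic evaluates to about 21.61, in agreement with the value quoted above, and the corresponding P value is well below 0.0001.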


Conclusions 


As in the blood system, the longevity of many tissues depends 
upon continuous regeneration from small pools of self-renewing 
stem cells. By taking advantage of a unique mouse strain with a 
hypomorphic Lig4^Y288C mutation and the highly specific role of
LigIV in the repair of DNA double-strand breaks by NHEJ, we have
shown that unrepaired DNA damage arising in stem cells under
physiological conditions in vivo leads to adult stem cell exhaustion
over time***. On the basis of our analysis of MEFs in culture and bone
marrow cells in vivo, we propose that unrepaired double-strand
breaks in KLS cells and their progenitors lead to decreased proliferative
potential, increased turnover of the haematopoietic stem cell
population, decreased self-renewal and hence age-dependent decline
in multipotent cells within the KLS population, which becomes
manifested as loss of bone marrow cellularity and erythropoiesis
(Fig. 4c), features characteristic of normal ageing”. Similar proliferative
exhaustion of stem cells has been reported for mice deficient in
p21 and Gfi-1 (refs 35, 36). It is also possible that stem cells have low
thresholds for damage checkpoint activation, exploiting apoptosis or
checkpoint arrest to limit the potential harmful impact of genetic
damage, which may contribute to their exhaustion during ageing*””*.
The rate-limiting role for the repair of DNA double-strand breaks
in the maintenance of adult stem cells in vivo, established here for
blood stem cells, implies that inherited or environmental factors that
increase oxidative DNA damage may be key determinants of tissue
ageing and limiting factors for the efficient transplantation of adult
stem cells.

Figure 5 | Lig4^Y288C impairs the intrinsic
function of adult haematopoietic stem cells.


METHODS SUMMARY 


ENU mutagenesis and mapping were performed as described previously”®.
Primary unfixed cells were sorted by flow cytometry without prior depletion
of lineage-positive cells. Primary MEFs were derived from wild-type and
Lig4^Y288C embryos, and double-strand break repair was analysed by γ-H2AX foci
analysis. Gene Juice was used for transfection experiments. The TNT T7 Quick
Coupled Transcription/Translation System (Promega) was used to generate LX
complexes in reticulocyte lysates. Recombinant human XRCC4 and histidine-tagged
wild-type and LigIV^Y288C complexes were expressed in pFastBac vectors
and purified from insect cells*. Complexes were assessed for function with the
use of anti-XRCC4 immunoprecipitation, double-strand ligation and adenylation
assays as described previously'’. Immunofluorescence was used to assess
γ-H2AX foci in confluent, non-dividing MEFs", and flow-sorted KLS cells were
stained with anti-53BP1. Telomere length was measured by fluorescence in situ
hybridization (FISH) with Flow-FISH”, and the co-localization of γ-H2AX foci
with telomeres was monitored by immuno-FISH. The CAFC assay" and single
and mixed radiation chimaeras”* were as described previously. In bone marrow
transfers, 5 × 10° CD45.1-marked wild-type B6 bone marrow cells were intravenously
injected into non-irradiated CD45.2-marked wild-type and CD45.2-marked
Lig4^Y288C B6 mice; and CD45.2-marked wild-type cells were injected into
CD45.1 × CD45.2 F1 wild-type recipients to control for immunological rejection’’.
Cell turnover was assessed in mice injected once and then fed with BrdU
for 40 h (BrdU Flow Kit; BD Pharmingen). Statistical analysis was by analysis of 
variance or t-test with GraphPad Prism 4.00. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

ENU mutagenesis. ENU mutagenesis was performed as described previously”®. 
Experiments were approved by the Australian National University Animal Ethics 
and Experimentation Committee, and by the Oxford University Ethical Review 
Committee and UK Home Office Licence. 

Complementation. Transformed Lig4^Y288C MEFs were transfected three times
by using Gene Juice with Myc-tagged human Lig4 cDNA cloned into a pCI- 
puro(myc) plasmid. Transfectants, identified by using the Myc tag, were 
scored for 53BP1 foci formation at the indicated times after exposure to 3 Gy 
of γ-radiation. 53BP1 rather than γ-H2AX foci were analysed because of the
availability of secondary antibody, but control experiments have shown that the 
two analyses provide identical results. The transfection frequency was 10%. 
Results are the means and s.d. for three experiments. 

Immunofluorescence for LigIV and XRCC4 expression. MEFs from the wild-type,
Lig4^Y288C and Lig4-null mice (Lig4⁻/⁻p53⁻/⁻) were stained with anti-LigIV
and anti-XRCC4 antibodies from Serotec.

Biochemical procedures. Preparation of whole cell extracts, anti-XRCC4 
immunoprecipitation and adenylation of the endogenous LigIV were performed 
as described previously’’. Because LX complexes expressed in vivo are pre-adenylated,
immunoprecipitated complexes were pretreated with 5 mM disodium
pyrophosphate (PPi), as described previously, before the addition of
labelled ATP to monitor adenylation activity'’. Specific antibodies used for 
immunoprecipitation and the western blots were anti-XRCC4 (AHP387) and 
anti-DNA LigIV (AHP554) (Serotec). 

The Promega TNT T7 Quick Coupled Transcription/Translation System was 
used for transcription and translation of LX complexes in vitro, as in 
Supplementary Fig. 1c. The vector pcDNA3 (Invitrogen) expressing wild-type 
human XRCC4 or human LigIV was used in these experiments and for site- 
directed mutagenesis. 

pFastBac vectors expressing untagged human XRCC4 and histidine-tagged 
LigIV wild-type and Y288C mutant in insect cells were used to generate wild-type 
and Y288C LigIV-XRCC4 complexes.

Protein expression and purification were performed as described previously”. 
The concentration of the LigIV-XRCC4 complexes was determined by ultraviolet
absorption measurements, with an extinction coefficient at 280 nm of
121,330 M⁻¹ cm⁻¹ estimated from the amino acid sequence (ProtParam, available
at http://www.expasy.ch/). The adenylation and double-strand ligation
assays with recombinant LigIV-XRCC4 complexes were as described previously*’.
For adenylation assays, wild-type and Y288C protein complexes were
pretreated with 5 mM PPi. For double-strand ligation assays, 20 ng of a 442-base-pair
(bp) double-stranded DNA fragment with 4-bp overhangs at each end was
produced from the pBluescript plasmid (Stratagene), 5'-end-labelled and incubated
with LigIV-XRCC4 complexes without PPi pretreatment and in the
absence of ATP. 
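
The concentration determination above follows the Beer-Lambert law, A = ε·c·l. A minimal sketch of that conversion is given here; the extinction coefficient is the value quoted in the text, while the absorbance reading and the 1 cm path length are hypothetical values chosen only for illustration.

```python
# Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l).
EPSILON_280 = 121_330.0  # M^-1 cm^-1, value quoted in the text (ProtParam estimate)


def molar_concentration(absorbance_280, path_length_cm=1.0, epsilon=EPSILON_280):
    """Return the molar concentration implied by an absorbance at 280 nm."""
    if path_length_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    return absorbance_280 / (epsilon * path_length_cm)


# Hypothetical reading, not taken from the paper.
example_a280 = 0.25
example_conc = molar_concentration(example_a280)
print(f"A280 = {example_a280} -> {example_conc * 1e6:.2f} uM")
```

The sequence-derived coefficient is itself an estimate, so concentrations obtained this way carry that uncertainty.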

Cell culture and lifespan analysis. Primary MEFs derived from wild-type and 
Lig4^Y288C embryos were cultured in DMEM medium containing 15% heat-inactivated
FCS. Population doublings (PDs) were determined by counting cell 
numbers after each passage. At later PDs, rapidly growing transformed cells were 
evident in some populations. Data shown for PDs are before such outgrowth. 
The results shown represent a single experiment but similar results were obtained 
in three independent experiments. 

Foci analysis in vitro. Foci analysis was undertaken in non-dividing confluent 
cells as described previously'*. Specific antibodies used were anti-phospho- 
H2AX (Ser 139) mouse monoclonal (Upstate Cell Signalling Solutions) and 
anti-ATM pS1981 (rabbit) (Rockland Immunochemicals). The γ-H2AX and 
pS1981-ATM foci counts were nearly identical in all experiments. 

Examination of bone marrow cells for unrepaired double-strand breaks in 
vivo. Bone marrow cells were extracted from rag1^-/- Lig4^Y288C mice to establish
whether double-strand break DNA damage accumulated in Lig4^Y288C cells independently
of the activity of RAG (recombinase-activating gene) in vivo. Cells 
were carefully deposited on coverslips after cytospinning and were examined 
with anti-γ-H2AX (Ser139) mouse monoclonal (Upstate Cell Signalling 
Solutions), polyclonal rabbit anti-53BP1 (Bethyl), anti-rabbit Cy3 and anti-mouse
fluorescein isothiocyanate (FITC) (Dako) antibodies. Cells with overlapping
foci for both markers were then scored. Previous analysis with MEFs 
exposed to ionizing radiation has shown that γ-H2AX and 53BP1 foci analyses
yield identical results. This procedure decreased problems with background 
signal from each single antibody. Positive cells normally displayed only one or 
two foci per cell. Bone marrow cells were examined from four rag1^-/- Lig4^Y288C
and four rag1^-/- Lig4^+/+ mice. The age of the mice ranged from 9 to 13 weeks, 
and mice were age- and sex-matched between the groups. Using this approach 
we found a mean of 0.3% cells with foci in total bone marrow from 
rag1^-/- Lig4^+/+ mice (individual percentages 0 out of 525, 0.26%, 0.29% and 
0.8%), in contrast with 2.6% of cells in rag1^-/- Lig4^Y288C mice (individual 
percentages 0.7%, 2.1%, 2.6% and 5.6%). 

For analysis of KLS cells, cold 10,000—50,000 unfixed stained KLS cells were 
sorted from whole bone marrow, without previous depletion, with a MoFlow 
Cytometer (Dako Cytomation) into a volume of 0.2 ml and were cytospun 
and stained with polyclonal rabbit anti-53BP1 and anti-rabbit FITC. Sorting 
yielded similar numbers of cells with foci to whole bone marrow (see the text). 
In this analysis only cells displaying 53BP1 foci were scored, because the use 
of fluorescent antibodies during FACS sorting excluded the use of multiple 
fluorochromes. 

ROS detection. Analysis was undertaken on confluent cells grown in 3% O2 using 
the Image-iT LIVE green ROS detection kit (Molecular Probes). Essentially, cells 
were labelled with or without 10 µM 2′,7′-dichlorodihydrofluorescein diacetate 
(H2DCFDA). After washing, the relevant cells were treated with 0.5 M tert-butyl 
hydroperoxide, a stable agent that induces oxidative damage providing a positive 
control. After further washing, cells were trypsin-treated, fixed and analysed by 
flow cytometry for measurement of H2DCFDA fluorescence. 

Telomere analysis. Telomere length was measured by Flow-FISH as described 
previously”. Co-localization of γ-H2AX foci with telomeres was examined by a 
standard immuno-FISH procedure, staining γ-H2AX with monoclonal anti-phospho-H2AX
antibody (Upstate), followed by the FISH protocol using 
Cy3-labelled telomeric probe. No chromosome aberrations, telomere/break 
fusions or loss of telomeric signals in individual chromosomes were observed 
by Q-FISH. 

Flow cytometry. The following anti-mouse monoclonal antibodies were used: 
FITC antibodies against CD34 (RAM34), CD45.1 (A20), GR1 (RB6-8C5) (all 
from BD Pharmingen), B220, CD4, CD8, CD11b/MAC1 (all from Caltag), phycoerythrin
(PE) anti-Sca1 (E13-161.7), allophycocyanin (APC) anti-c-Kit (2B8), 
biotin anti-TER119 (all from BD Pharmingen), biotin anti-Flt3 (A2F10, 
eBioscience), PE-Cy7 anti-c-Kit (2B8, Insight Biotech) and AlexaFluor-647 
anti-CD34 (RAM34, Insight Biotech). Unconjugated rat anti-mouse lineage 
markers (BD Pharmingen), tricolour goat anti-rat IgG, (Caltag), streptavidin- 
tricolour (Caltag) and streptavidin-APC-Cy7 (BD Pharmingen) were also used. 
The data were collected with a FACSCanto (Becton Dickinson). 

Bone marrow chimaeras. CD45 allotype-marked B6 mice were γ-irradiated 
with two 4.5-Gy doses 3h apart, and injected intravenously with CD45.1-WT 
and CD45.2-Lig4^Y288C bone marrow cells, either separately or mixed in a 1:1 
ratio. Either 5 X 10° cells were injected into each recipient (Fig. 5a) or more 
Lig4^Y288C cells were injected to control for the reduced proportion of KLS cells in 
the Lig4^Y288C donor bone marrow cells (Fig. 5b). The recipients were kept on 
0.25 mg ml⁻¹ amoxycillin. 

Bone marrow transfers. CD45.1-WT B6 bone marrow cells (5 X 10°) were 
intravenously injected into non-irradiated CD45.2-WT and CD45.2-Lig4^Y288C 
B6 mice; CD45.2-WT cells were injected into CD45.1 × CD45.2 F1 wild-type 
recipients to control for immunological rejection”’. 

BrdU incorporation. Mice were injected with 1.5—3.0 mg of BrdU in PBS (1 mg 
per 10 g body weight), and fed with BrdU at 1 mg ml⁻¹ in drinking water, 
protected from light and supplemented with 1% glucose to prevent taste aver- 
sion, for 40 h. BrdU incorporation was assessed with a FITC BrdU Flow Kit (BD 
Pharmingen) in accordance with the manufacturer’s protocol. 

CAFC assay. The CAFC assay was performed as described previously"’. 

Statistical analysis. Analysis of variance or t-test was performed with GraphPad 
Prism 4.00 (GraphPad Inc.). 

Of the over 200 known extrasolar planets, just 14 pass in front of 
and behind their parent stars as seen from Earth. This fortuitous 
geometry allows direct determination of many planetary prop- 
erties'. Previous reports of planetary thermal emission”~ give 
fluxes that are roughly consistent with predictions based on ther- 
mal equilibrium with the planets’ received radiation, assuming a 
Bond albedo of ~0.3. Here we report direct detection of thermal 
emission from the smallest known transiting planet, HD 149026b, 
that indicates a brightness temperature (an expression of flux) of 
2,300 ± 200 K at 8 µm. The planet’s predicted temperature for uniform,
spherical, blackbody emission and zero albedo (unprecedented
for planets) is 1,741 K. As models with non-zero albedo are 
cooler, this essentially eliminates uniform blackbody models, and 
may also require an albedo lower than any measured for a planet, 
very strong 8 µm emission, strong temporal variability, or a heat 
source other than stellar radiation. On the other hand, an instant- 
aneous re-emission blackbody model, in which each patch of 
surface area instantly re-emits all received light, matches the 
data. This planet is known*” to be enriched in heavy elements, 
which may give rise to novel atmospheric properties yet to be 
investigated. 

The dimming of infrared light as a planet passes behind its star 
(secondary eclipse) yields the planet’s intrinsic day-side emission, 
which can be expressed as a brightness temperature (see Fig. 1, and 
its legend for definition of several variables). We observed a secondary
eclipse of HD 149026b on 24 August 2005 with the 8 µm channel 
of the Infrared Array Camera”’ on the Spitzer Space Telescope’’. The 
instrument is very stable, but operates near its limits for this obser- 
vation. We correct several instrumental effects with a model, detailed 
in the Supplementary Information, that achieves over 80% of the 
photon-limited signal-to-noise ratio. Figure 2 presents binned, raw 
photometry showing that the eclipse is evident even without this 
analysis model. Figure 3 shows the eclipse with two different bin- 
nings, after removing instrumental effects. The Supplementary 
Information also presents unbinned plots, the photometric data in 
machine-readable format, and the details of a method for calculating 
T_b without reference to the uncertain system distance. Table 1 gives 
the model fit results, including the unexpectedly large eclipse depth. 
Figure 4 shows that even an F = 1, A = 0, uniform, blackbody model 
is unlikely to explain that depth. 

An A = 0, F = 3/8 (instantaneous re-emission), local, blackbody 
model matches the data well. Here, local thermal equilibrium with 
the incoming stellar radiation determines the temperature of each 
point on the planetary surface. This model has a substellar temperature
of nearly 2,500 K and T_eq = 2,200 K. This scenario is plausible 
if the bulk of stellar energy is absorbed in an atmospheric region 
where the re-radiation timescale is much shorter than the timescale 
needed to advect heat around the planet'”. A team (including us) 
recently observed13 the latter condition on the non-transiting 
hot-Jupiter planet υ Andromedae b, although an albedo was not 
robustly determined. 

If the planetary spectrum were that of a blackbody, the zero-albedo,
instantaneous re-radiation T_eq would match the reported 
T_b. The spectrum could resemble a blackbody if most of the absorption
occurred in a thick cloud of high-temperature condensates. Both 
the high metallicity of the parent star® ([Fe/H] = +0.36) and the 
enriched planetary composition®”’ argue for a complex atmospheric 
chemistry. Condensing iron and forsterite are predicted for ~1,100 K 
exoplanets’*'’. These materials are common in terrestrial rocks and, 
even as liquids, have plausibly low reflectivity. However, they are not 
expected to condense at 2,300 K, so this model is not attractive. 
Furthermore, most models of hot-Jupiter spectra depart significantly 
from a blackbody. For a variety of temperatures and atomic abun- 
dances, water vapour is the most spectroscopically active molecule 
expected in the 8 µm region, but it appears as an absorber that suppresses
the spectrum there. 
Figure 1 | Comparison of exoplanetary brightness temperatures T_b versus equilibrium 
temperatures T_eq for a Bond albedo (wavelength-independent reflectivity) 
A of 0.3 and uniform planetary emission. T_b is the temperature of a 
blackbody that would produce the same flux as was measured in a given 
filter. It is a statement of the measured flux (see Supplementary 
Information). Variations in T_b with wavelength are deviations from a 
blackbody spectrum, and yield insight into atmospheric properties. The 
effective temperature T_eff is that for which a blackbody would emit the same 
total flux as measured from the body. One can predict T_eff by assuming a 
value for A and balancing incoming stellar radiation with blackbody thermal 
emission at

$$T_{\rm eq} = T_* \left(\frac{R_*}{a}\right)^{1/2} \left(\frac{1-A}{4F}\right)^{1/4},$$

where T_* and R_* are the stellar T_eff and 
radius, respectively, a is the distance between the planet and the star, and F is 
equivalent to the fraction of the planet’s surface that emits at T_eq. F = 1 for 
uniform planetary emission, F = 1/2 for uniform hemispheric emission 
(which is unphysical), and F = 3/8 for instantaneous re-emission of 
absorbed radiation without advection. In the latter case, each element of the 
surface acts as an isolated blackbody with its own temperature. Alternatively, 
if one assumes that T_b = T_eff, one can estimate A, to the extent that the 
planet’s spectrum matches that of a blackbody. Objects named in the key 
(TrES-1, HD 149026b, and so on) are extrasolar planets. 
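
The temperatures quoted in the text can be cross-checked from the way the equilibrium temperature scales with A and F, namely as [(1 − A)/(4F)]^(1/4). The sketch below rescales the 1,741 K reference case (A = 0, F = 1) given in the text, so no stellar parameters need to be assumed. The function name is hypothetical, and treating the substellar point as the F = 1/4 case is an assumption that follows from local equilibrium with normally incident starlight.

```python
T_UNIFORM_ZERO_ALBEDO = 1741.0  # K, from the text (A = 0, F = 1)


def equilibrium_temperature(albedo, f, t_ref=T_UNIFORM_ZERO_ALBEDO):
    """Rescale the A = 0, F = 1 reference temperature to other (A, F).

    T_eq is proportional to [(1 - A) / (4 F)]**(1/4), so the ratio to the
    reference case is [(1 - A) / F]**(1/4).
    """
    if not 0.0 <= albedo < 1.0:
        raise ValueError("albedo must lie in [0, 1)")
    if f <= 0.0:
        raise ValueError("F must be positive")
    return t_ref * ((1.0 - albedo) / f) ** 0.25


t_instant = equilibrium_temperature(0.0, 3.0 / 8.0)   # instantaneous re-emission
t_hemisphere = equilibrium_temperature(0.0, 0.5)      # uniform hemispheric emission
t_albedo_03 = equilibrium_temperature(0.3, 1.0)       # uniform emission, A = 0.3
t_substellar = equilibrium_temperature(0.0, 0.25)     # assumed substellar-point case

for label, value in [
    ("F = 3/8, A = 0", t_instant),
    ("F = 1/2, A = 0", t_hemisphere),
    ("F = 1,   A = 0.3", t_albedo_03),
    ("substellar point, A = 0", t_substellar),
]:
    print(f"{label}: {value:.0f} K")
```

Under these assumptions the F = 3/8 case comes out near 2,200 K and the substellar point a little under 2,500 K, in line with the values quoted in the text.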
An alternative explanation for the high 8 µm T_b is thermal emission
from an inversion layer (a region where temperature increases 
with altitude). Emission from H2O could then create a high day-side 
T_b, while the planet’s T_eff remained consistent with uniform redistribution
of the incident stellar irradiation (=1,741 K). Because one 
expects the stellar radiation to heat mainly the lower regions of the 
atmosphere, a very strong absorbing gas or solid species must act at 
altitude to create this inversion. 

Fortney et al.7 raised the possibility that TiO and VO gas molecules 
in an atmosphere enriched in heavy elements are the absorbers 
responsible for the inversion layer. This model predicts a contrast 
ratio for uniform emission of 0.00067, approaching within 1.4σ of 
our measurement, for solar levels of Ti, ten times the solar abundance 
of other heavy elements, and uniform planetary emission. Two other 
models of ref. 7 match our observation very well. However, the paper 
also notes that TiO should condense at a temperature minimum 
much deeper than the levels observed here, in a cold trap. The air 
above that level should be dry of TiO, just as Earth’s stratosphere is 
dry due to trapping of water below the tropopause. The matching 
models both have abundant TiO at altitude, violating the cold trap, 
and it is not yet clear whether even the solar level of Ti is consistent 
with a cold trap. The TiO scenario could work, however, if the 
temperature is above the TiO condensation point at all depths, which 
may be plausible. A detailed investigation of TiO and VO is war- 
ranted, coupled to age-dependent luminosity models under high 
metallicity that may cause a hotter adiabat’®, and atmospheric 
Figure 2 | Unprocessed photometric light curve for the 8 µm secondary 
eclipse of HD 149026b. Each point is the average of ~4,000 frames taken in 
exactly one cycle of nine telescope positions. This binning eliminates several 
instrumental effects from the plot, allowing the eclipse (lowest line and data 
points) to be distinguished from the fitted stellar model without an eclipse 
(top line). Phase refers to the orbit, where 0.5 is halfway between transits. 
The overall rise in the signal is a known instrumental effect” that we model 
by fitting an asymptotic function with three free parameters (top line, see 
Supplementary Information). It is removed in Fig. 3. Error bars and 
uncertainties throughout the paper are 1σ. The star is bright, but use of the 
32 × 32-pixel subarray mode, a 0.4 s frame rate, and a nine-position, non-random
dither pattern avoided saturating the detector. We cycled the 
telescope around the dither pattern 12 times and took 448 frames at each 
visit to a position. The six-hour sequence produced 48,145 usable images. 
Aperture photometry of each image produced a time series for the event. We 
account for sub-pixel motions of the stellar image by subdividing each pixel 
and including only the portion that falls inside the aperture or sky annulus, 
as appropriate. The Supplementary Information describes the photometry 
and analysis in detail, including the modelling and removal of several 
previously unidentified systematics. The modelling was necessary to obtain 
accurate timing, but the eclipse depth derived by an independent analysis 
(by us) without the model for new systematics agrees closely with that 
presented here. We used a different functional form for the overall signal rise 
in that analysis, as a consistency check. 
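
The frame counts in this caption can be checked with simple bookkeeping. The sketch below uses only numbers stated in the caption and assumes the "0.4 s frame rate" means 0.4 s per frame. The variable names are illustrative.

```python
CYCLES = 12              # trips around the dither pattern
POSITIONS = 9            # telescope positions per cycle
FRAMES_PER_VISIT = 448   # frames taken at each visit to a position
FRAME_TIME_S = 0.4       # assumed to be seconds per frame
USABLE_IMAGES = 48_145   # usable images reported in the caption

frames_per_cycle = POSITIONS * FRAMES_PER_VISIT      # one plotted point in Fig. 2
pointings = CYCLES * POSITIONS                       # one bin per pointing in Fig. 3b
total_frames = CYCLES * frames_per_cycle
usable_fraction = USABLE_IMAGES / total_frames
frame_time_hours = total_frames * FRAME_TIME_S / 3600.0

print(f"frames per cycle: {frames_per_cycle}")
print(f"pointings: {pointings}")
print(f"total frames: {total_frames}, usable fraction: {usable_fraction:.4f}")
print(f"summed frame time: {frame_time_hours:.2f} h")
```

The per-cycle count matches the "~4,000 frames" per point and the 108 pointings of Fig. 3b. The summed frame time is somewhat under six hours; the difference is presumably overhead, which the caption does not itemize.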
Figure 3 | Binned light curves, corrected for instrumental effects (see 
Supplementary Information). a, Same bins, horizontal axis, and lines as 
Fig. 2. b, One bin per telescope pointing, 108 in all, with only representative 
1σ error bars for clarity. Unity on the vertical axis represents the system flux 
outside of eclipse. The eclipse model is a series of five line segments: two that 
are constant before and after the eclipse, one that is constant at the in-eclipse 
level, and two mirror-symmetric segments that connect the other three. The 
latter represent the planet’s crossing of the stellar limb with sufficient 
accuracy for our purposes. The eclipse’s mid-time, duration, and fractional 
depth are free parameters. The duration of stellar limb crossing was not well 
determined, so we fixed it at a nominal value. We fitted only to unbinned 
data. For plotting purposes, we evaluated the fitted eclipse model at the time 
of each data point, binned it identically to the data, and connected those 
points with line segments. The plotted line is thus a binned version of the 
model, not the model itself. 
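
The five-segment eclipse model described in this caption can be sketched as a symmetric trapezoid. This is a minimal illustration, not the authors' fitting code. It assumes the duration is measured from half-light to half-light (as stated in the Table 1 notes) and uses the Table 1 values only as example inputs. The function name and argument names are hypothetical.

```python
def eclipse_model(t, t_mid, duration, limb_time, depth):
    """Normalized flux of a five-segment (trapezoidal) eclipse model.

    t, t_mid, duration and limb_time share one time unit; duration runs from
    half-light to half-light, and depth is the fractional eclipse depth.
    """
    dt = abs(t - t_mid)                      # mirror symmetry about mid-eclipse
    inner = duration / 2.0 - limb_time / 2.0  # end of the flat in-eclipse floor
    outer = duration / 2.0 + limb_time / 2.0  # first/last contact
    if dt >= outer:
        return 1.0                           # constant level outside eclipse
    if dt <= inner:
        return 1.0 - depth                   # constant in-eclipse level
    # Linear ramp while the planet crosses the stellar limb.
    return 1.0 - depth * (outer - dt) / limb_time


# Example inputs in seconds, taken from Table 1 for illustration only.
DURATION_S = 3 * 3600 + 13 * 60 + 12      # 3:13:12
LIMB_S = 11 * 60 + 49.98                  # 0:11:49.98
DEPTH = 0.00084

for offset in (0.0, DURATION_S / 2.0, DURATION_S):
    flux = eclipse_model(offset, 0.0, DURATION_S, LIMB_S, DEPTH)
    print(f"t - t_mid = {offset:8.1f} s -> flux = {flux:.6f}")
```

To reproduce the plotted line, the caption's procedure would evaluate such a model at each data time and then bin it exactly as the data are binned.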


circulation models that will clarify the temperature distribution in 
the lower atmosphere”. 

If the eclipse mid-time of a transiting planet is not exactly 
centred between transit times (orbital phase = 0.5), it indicates either 
a non-circular orbit, perturbations by other planets in the system’, 
or (for small time offsets) an offset hotspot on the day side of the 
planet'®*’. Our eclipse mid-time is heliocentric Julian date (HJD) 
2,453,606.960 ± 0.001. We received an ephemeris for the planet from 
the team of ref. 21 that adds several unpublished, ground-based 
observations to those already published®” to predict a substantially 
more precise mid-transit time of HJD 2,453,606.962 ± 0.001, including
a 42 s correction for light-travel time across the orbit of 
HD 149026b. This prediction is consistent with our measurement 
at the 1σ level. 
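
As a rough check on the timing numbers in this paragraph, the sketch below converts the difference between the measured and predicted times into seconds. It also estimates the orbital radius implied by the 42 s light-travel correction, under the assumption (not stated in the text) that "across the orbit" means one orbital diameter of a circular orbit. The constants are standard values; the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400.0
SPEED_OF_LIGHT_M_S = 299_792_458.0     # exact, by definition
ASTRONOMICAL_UNIT_M = 1.495978707e11   # IAU 2012 definition

measured_hjd = 2_453_606.960    # eclipse mid-time from the text
predicted_hjd = 2_453_606.962   # ephemeris prediction from the text

# Negative means the eclipse was observed earlier than predicted.
offset_s = (measured_hjd - predicted_hjd) * SECONDS_PER_DAY

# Assumption: the 42 s correction is the light-crossing time of the orbital
# diameter, so the orbital radius is c * 42 s / 2.
light_travel_s = 42.0
orbital_radius_au = SPEED_OF_LIGHT_M_S * light_travel_s / 2.0 / ASTRONOMICAL_UNIT_M

print(f"offset from rounded HJDs: {offset_s:.0f} s")
print(f"implied orbital radius: {orbital_radius_au:.4f} AU")
```

The offset computed from the rounded dates is a few seconds smaller in magnitude than the eclipse-centre entry in Table 1, which presumably was derived from unrounded values.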

We emphasize that our single measurement is not enough to con- 
firm any one atmospheric scenario. Alternative scenarios for the high 
T_b should be investigated, such as tidal heating23 and spatio-temporal 
variability around a lower average flux24. A recent spectrum25 of 


Table 1 | Model parameters 

Parameter                         Value        1σ uncertainty           S/N
Planet-star contrast, f           0.00084      −0.00012   +0.00009      7.9
Eclipse centre, t,                −0:02:58     −0:02:30   +0:01:40      1,994
Eclipse duration, tg              3:13:12      −0:03:33   +0:04:08      50
Eclipse limb-crossing time, t,    0:11:49.98   Fixed      Fixed         ...
Rise-time offset (phase), t0      0.251        −0.012     +0.006        27
Rise goal (µJy), l                126,862      −40        +56           2,621
Rise exponent (phase⁻¹), m        19.50        −1.2       +0.7          21

S/N is the signal (or systematic error parameter)-to-noise ratio, where N = half of the 1σ 
uncertainty range. The eclipse centre is the time relative to ephemeris phase 0.5, corrected for 
light-travel time (see text). Times are in h:mm:ss format. The eclipse duration is from half-light 
to half-light; add the limb-crossing time for the duration from first to last contacts. See
Supplementary Information for the variables; t0, l and m are parameters for the rise in Fig. 2.
Eclipse centre S/N is relative to the full orbit period of 2.876 days. 
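
The S/N column follows the definition in the table notes: the value divided by half of the 1σ uncertainty range. The sketch below applies that definition to two rows of Table 1 and also reproduces the 1.4σ distance between the measured contrast and the 0.00067 model contrast quoted in the text. The helper names are hypothetical, and using the lower uncertainty for the model comparison is an assumption.

```python
def hms_to_seconds(text):
    """Convert an unsigned 'h:mm:ss' string to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def signal_to_noise(value, minus, plus):
    """S/N with N defined as half of the (minus + plus) 1-sigma range."""
    return abs(value) / ((minus + plus) / 2.0)


# Planet-star contrast row.
contrast_snr = signal_to_noise(0.00084, 0.00012, 0.00009)

# Eclipse duration row, converted from h:mm:ss.
duration_snr = signal_to_noise(
    hms_to_seconds("3:13:12"),
    hms_to_seconds("0:03:33"),
    hms_to_seconds("0:04:08"),
)

# Distance of the 0.00067 model contrast from the measurement, in units of
# the lower 1-sigma uncertainty (the model lies below the measured value).
model_tension_sigma = (0.00084 - 0.00067) / 0.00012

print(f"contrast S/N: {contrast_snr:.1f}")
print(f"duration S/N: {duration_snr:.1f}")
print(f"model offset: {model_tension_sigma:.2f} sigma")
```

Small differences from the tabulated S/N values are to be expected, since the table entries are rounded.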
Figure 4 | Equilibrium temperature parameters. We refer to the 
assumptions described in Fig. 1. The lines show the relationship between A 
and F for a given T_eq. The bold line is our calculated T_b, and the others are 
spaced at 1σ intervals. The line representing the data does not approach the 
lower-right corner, ruling out the often-assumed uniform blackbody 
emission at the 3σ level, even for 5% albedo. 


HD 209458b tentatively shows a small emission peak in our band- 
pass, but the feature is too narrow and weak to explain HD 149026b’s 
enhanced brightness on its own. Indeed, it was not identified in a 
spectrum26 of HD 189733b. We thus call for further measurements of 
HD 149026b, at 8 µm to search for variability, and at other wavelengths
to test the inversion interpretation by searching for departures
from blackbody emission. 


Received 28 January; accepted 17 April 2007. 
Published online 9 May 2007. 


1. Charbonneau, D., Brown, T. M., Burrows, A. & Laughlin, G. in Protostars and Planets 
V (eds Reipurth, B., Jewitt, D. & Keil, K.) 701-716 (Univ. Arizona Press, Tucson, 
Arizona, 2007). 

2. Charbonneau, D. et al. Detection of thermal emission from an extrasolar planet. 
Astrophys. J. 626, 523-529 (2005). 

3. Deming, D., Seager, S., Richardson, L. J. & Harrington, J. Infrared radiation from an 
extrasolar planet. Nature 434, 740-743 (2005). 

4. Deming, D., Harrington, J., Seager, S. & Richardson, L. J. Strong infrared emission 
from the extrasolar planet HD 189733b. Astrophys. J. 644, 560-564 (2006). 

5. Knutson, H. A. et al. A map of the day-night contrast of the extrasolar planet HD 
189733b. Nature 447, 183-186 (2007). 

6. Sato, B. et al. The N2K Consortium. II. A transiting hot Saturn around HD 149026 
with a large dense core. Astrophys. J. 633, 465-473 (2005). 

7. Fortney, J. J., Saumon, D., Marley, M. S., Lodders, K. & Freedman, R. S. 
Atmosphere, interior, and evolution of the metal-rich transiting planet HD 
149026b. Astrophys. J. 642, 495-504 (2006). 

8. Ikoma, M., Guillot, T., Genda, H., Tanigawa, T. & Ida, S. On the origin of HD 
149026b. Astrophys. J. 650, 1150-1159 (2006). 

9. Broeg, C. & Wuchterl, G. The formation of HD 149026b. Mon. Not. R. Astron. Soc. 
376, L62-L66 (2007). 

10. Fazio, G. G. et al. The Infrared Array Camera (IRAC) for the Spitzer Space 
Telescope. Astrophys. J. Suppl. 154, 10-17 (2004). 




11. Werner, M. W. et al. The Spitzer Space Telescope Mission. Astrophys. J. Suppl. 154, 
1-9 (2004). 

12. Seager, S. et al. On the dayside thermal emission of hot Jupiters. Astrophys. J. 632, 
1122-1131 (2005). 

13. Harrington, J. et al. The phase-dependent infrared brightness of the extrasolar 
planet υ Andromedae b. Science 314, 623-626 (2006). 

14. Cooper, C. S., Sudarsky, D., Milsom, J. A., Lunine, J. I. & Burrows, A. Modeling the 
formation of clouds in brown dwarf atmospheres. Astrophys. J. 586, 1320-1337 
(2003); erratum 595, 573 (2003). 

15. Fortney, J. J. The effect of condensates on the characterization of transiting planet 
atmospheres with transmission spectroscopy. Mon. Not. R. Astron. Soc. 364, 
649-653 (2005). 

16. Burrows, A., Hubeny, I., Budaj, J. & Hubbard, W. B. Possible solutions to the radius 
anomalies of transiting giant planets. Astrophys. J. (in the press); preprint at 
(http://arxiv.org/astro-ph/0612703) (2007). 

17. Cooper, C. S. & Showman, A. P. Dynamics and disequilibrium carbon chemistry in 
hot Jupiter atmospheres, with application to HD 209458b. Astrophys. J. 649, 
1048-1063 (2006). 

18. Holman, M. J. & Murray, N. W. The use of transit timing to detect terrestrial-mass 
extrasolar planets. Science 307, 1288-1291 (2005). 

19. Williams, P. K. G., Charbonneau, D., Cooper, C. S., Showman, A. P. & Fortney, J. J. 
Resolving the surfaces of extrasolar planets with secondary eclipse light curves. 
Astrophys. J. 649, 1020-1027 (2006). 

20. Rauscher, E. et al. Toward eclipse mapping of hot Jupiters. Astrophys. J. (in the 
press); preprint at (http://arxiv.org/astro-ph/0612412) (2007). 

21. Holman, M. J. et al. The Transit Light Curve Project. I. Four consecutive transits of 
the exoplanet XO-1b. Astrophys. J. 652, 1715-1723 (2006). 

22. Charbonneau, D. et al. Transit photometry of the core-dominated planet HD 
149026b. Astrophys. J. 636, 445-452 (2006). 

23. Levrard, B. et al. Tidal dissipation within hot Jupiters: a new appraisal. Astron. 
Astrophys. 462, L5-L8 (2007). 

24. Rauscher, E., Menou, K., Cho, J. Y.-K., Seager, S. & Hansen, B. Hot Jupiter 
variability in eclipse depth. Astrophys. J. (in the press); preprint at 
(http://arxiv.org/astro-ph/0612413) (2007). 

25. Richardson, L. J., Deming, D., Horning, K., Seager, S. & Harrington, J. A spectrum of 
an extrasolar planet. Nature 445, 892-895 (2007). 

26. Grillmair, C. J. et al. A Spitzer spectrum of the exoplanet HD 189733b. Astrophys. J. 
658, L115-L118 (2007). 


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank Spitzer's Director for discretionary time; G. Squires 
and the Spitzer staff for rapid proposal handling and scheduling; B. Hansen, C. Lisse, 
T. Loredo, and W. T. Reach for discussions; and A. Wolf, J. Winn, G. Henry, 
M. Holman, H. Knutson and D. Charbonneau for discussions and for sharing results 
before publication. W. Bowman assisted in preparing Fig. 1. We thank 
C. Markwardt, the Free Software Foundation, W. Landsman, other contributors to 
the Interactive Data Language Astronomy Library, and the open-source 
community for software. This work is based on observations made with the Spitzer 
Space Telescope, which is operated by the Jet Propulsion Laboratory, California 
Institute of Technology, under a contract with NASA. This material is based upon 
work supported by the US National Science Foundation and by the US National 
Aeronautics and Space Administration through an award issued by JPL/Caltech. 


Author Information The original data are available from the Spitzer Space 
Telescope archive, program 254. Reprints and permissions information is available 
at www.nature.com/reprints. The authors declare no competing financial 
interests. Correspondence and requests for materials should be addressed to J.H. 
(jharring@physics.ucf.edu). 




LETTERS 


Vol 447|7 June 2007|doi:10.1038/nature05897 


High-resolution, high-sensitivity NMR of nanolitre 
anisotropic samples by coil spinning 


D. Sakellariou', G. Le Goff? & J.-F. Jacquinot” 


Nuclear magnetic resonance (NMR) can probe the local structure 
and dynamic properties of liquids and solids, making it one of the 
most powerful and versatile analytical methods available today. 
However, its intrinsically low sensitivity precludes NMR analysis 
of very small samples—as frequently used when studying isotopi- 
cally labelled biological molecules or advanced materials, or as 
preferred when conducting high-throughput screening of bio- 
logical samples or ‘lab-on-a-chip’ studies. The sensitivity of 
NMR has been improved by using static micro-coils', alternative 
detection schemes” and pre-polarization approaches*. But these 
strategies cannot be easily used in NMR experiments involving the 
fast sample spinning essential for obtaining well-resolved spec- 
tra®*° from non-liquid samples. Here we demonstrate that induc- 
tive coupling allows wireless transmission of radio-frequency 
pulses and the reception of NMR signals under fast spinning of 
both detector coil and sample. This enables NMR measurements 
characterized by an optimal filling factor, very high radio- 
frequency field amplitudes and enhanced sensitivity that increases 
with decreasing sample volume. Signals obtained for nanolitre- 
sized samples of organic powders and biological tissue increase 
by almost one order of magnitude (or, equivalently, are acquired 
two orders of magnitude faster), compared to standard NMR 
measurements. Our approach also offers optimal sensitivity when 
studying samples that need to be confined inside multiple safety 
barriers, such as radioactive materials. In principle, the co- 
rotation of a micrometre-sized detector coil with the sample and 
the use of inductive coupling (techniques that are at the heart of 
our method) should enable highly sensitive NMR measurements 
on any mass-limited sample that requires fast mechanical rotation 
to obtain well-resolved spectra. The method is easy to implement 
on a commercial NMR set-up and exhibits improved performance 
with miniaturization, and we accordingly expect that it will facil- 
itate the development of novel solid-state NMR methodologies 
and find wide use in high-throughput chemical and biomedical 
analysis. 
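
The equivalence stated above between a signal gain of one order of magnitude and an acquisition two orders of magnitude faster follows from signal averaging: with uncorrelated noise, the signal-to-noise ratio grows as the square root of the number of scans. A minimal sketch of that arithmetic, with a hypothetical function name and the square-root law as its stated assumption:

```python
def acquisition_speedup(sensitivity_gain):
    """Factor by which measurement time shrinks at fixed signal-to-noise ratio.

    Assumes SNR scales as sqrt(number of scans), so a gain g in per-scan
    sensitivity lets the same SNR be reached in 1 / g**2 of the time.
    """
    if sensitivity_gain <= 0:
        raise ValueError("sensitivity gain must be positive")
    return sensitivity_gain ** 2


for gain in (3.0, 10.0):
    print(f"sensitivity x{gain:g} -> acquisition x{acquisition_speedup(gain):g} faster")
```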

According to the principle of reciprocity’, the experimental para- 
meters crucial for highly sensitive detection of the Faraday induction 
that gives rise to NMR signals are the ratio between the sample 
volume and the coil volume (the so-called filling factor), and the 
temperature and size of the sensor used. The introduction of 
micro-coils thus substantially improved the detection sensitivity of 
liquid-state NMR! of isotropic fluids, where brownian motion 
averages orientation-dependent (so-called anisotropic) interactions 
to yield narrow, chemically resolved spectra. Micro-coils owe much 
of their broad applicability to the fact that magnetic susceptibility 
line-broadening can be eliminated by either surrounding the coil 
with a susceptibility matching liquid, or using specially coated wires®. 
They have had a large effect on combined analytical methods (such as 


liquid chromatography (LC)-NMR), and even enabled micro-NMR 
and micro-magnetic resonance imaging (MRI) studies of single 
neurons””®. 

Micro-coil NMR has recently also been used on static solid 
samples’’’*. But in solids, the averaging of anisotropic interactions— 
essential if high-resolution spectra with narrow lines are to be 
obtained—can only be achieved by sample spinning®® (so-called 
magic-angle spinning, MAS). So far, there is no obvious hardware 
design for a spinning micro-rotor’’ that would allow a micro-coil 
with a diameter of typically less than 1 mm to be placed around a 
sample spun many thousands of turns per second about an axis 
making the ‘magic’ angle θ = θ_MAS of 54.7° with the external static 
magnetic field. The fact that MAS NMR requires a (static) sensor to be 
placed next to a rapidly rotating sample is also the reason why alterna- 
tive detection schemes such as mechanical’ or optical’ detection have 
not yet been used for solid-state NMR: these schemes can only be used 
with static samples, so any improvement in detection sensitivity 
comes at the expense of poorer spectral resolution. Similarly, efforts 
to improve detector performance by lowering its temperature (cryo- 
cooling) seem very promising’ but face a major engineering chal- 
lenge when MAS is involved. Combining cryo-cooling of the detector 
and MAS will also result in a massive reduction in filling factor due to 
the need to keep the sample at room temperature. Finally, we note 
that all previous ‘fixed-coil’ approaches are optimal for a certain 
amount of sample filling the coil, and that their sensitivity decreases 
as smaller samples are used. 

Here we present ‘magic angle coil spinning’ (MACS) as an alterna- 
tive approach; this uses wireless inductive coupling between the static 
coil that is normally used for sample spin manipulation and signal 
reception, and a tuned micro-coil that is co-rotating with the sample 
container (usually referred to as the ‘rotor’). Inductive coupling is a 
well-known mode of electromagnetic coupling” that requires no 
tethering of wires to the terminals of the coil. It is widely used in 
telemetry sensors’®, microelectromechanical systems (MEMS)’’, and 
MRI instrumentation"* with either standard”, cryogenically cooled”® 
or implanted”! radio-frequency (RF) coils. In our system, the existing 
(large) coil of a commercial probe is used to transmit power, and the 
probe’s tuning elements are used to fine-tune the ensemble and 
achieve impedance matching. The ensemble of the probe coil and 
the micro-coil thus acts like a rotating transformer, and the rotating 
sample container becomes an active component in the NMR detec- 
tion chain. When using static micro-coils, susceptibility effects have 
been a major source of line broadening, although approaches for 
eliminating them have been considered’. In our method, these effects 
are largely eliminated simply by placing the coil together with the 
tuning elements along the magic angle and spinning them. This is 
because broadening due to isotropic susceptibility effects transforms 
under rotation like a rank 2 tensor and its time independent part 
scales with the Legendre polynomial P2(cos θ), therefore most of it 
averages to zero by positioning and spinning at the magic angle”. 
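
The magic angle quoted earlier (54.7°) is the root of the second-order Legendre polynomial, P2(cos θ) = (3cos²θ − 1)/2, which is why a rank-2 contribution averages out under spinning about that axis. A short numerical check, using only the Python standard library:

```python
import math


def legendre_p2(x):
    """Second-order Legendre polynomial P2(x) = (3 x^2 - 1) / 2."""
    return 0.5 * (3.0 * x * x - 1.0)


# P2(cos(theta)) = 0 when cos(theta) = 1 / sqrt(3).
magic_angle_rad = math.acos(1.0 / math.sqrt(3.0))
magic_angle_deg = math.degrees(magic_angle_rad)

print(f"magic angle: {magic_angle_deg:.2f} degrees")
print(f"P2 at the magic angle: {legendre_p2(math.cos(magic_angle_rad)):.1e}")
print(f"P2 at 0 degrees (static field axis): {legendre_p2(1.0):.1f}")
```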

A sketch of a typical configuration for the MACS apparatus is 
shown in Fig. 1. The micro-coil is wound around the sample, result- 
ing in a filling factor close to unity. A long ceramic cylinder is tightly 
fitted inside a commercial rotor, in order to centre the sample- 
containing capillary that is placed together with a tuning chip capa- 
citor along the axis of rotation. This guarantees constant electromag- 
netic coupling while spinning. With this set-up and for a wide range 
of coil geometries, the power originally dissipated in the primary coil 
is essentially entirely dissipated in the secondary circuit (that is, the 
micro-coil), as if it were wired up with tethered connections. This 
holds in the so-called ‘over-coupling’ regime. The device thus 
produces high RF field amplitude B₁ per unit current without the 
introduction of additional noise, because the ensemble is matched to 
50 Ω, thereby leading to optimal sensitivity. This corresponds to an 
increase of the induced RF amplitude, for the same amplifier power, 
with respect to the coil of the MAS probe. This increase depends 
on the coils’ volume ratio. By virtue of the reciprocity principle, 
an equal increase in signal-to-noise ratio (SNR) is obtained. The 
performance depends non-critically on the difference between the 
resonant frequency of the micro-coil and the Larmor frequency of 
the spins, and the enhancement is rather broadband (D.S. and J.-F.J., 
manuscript in preparation). 

Optimized solid-state NMR of mass-limited samples is currently 
performed using small diameter coils and rotors, but can still suffer 
from low filling factors and artefact signals from the probe housing 
and the rotor. Rotating micro-coils can alleviate both drawbacks, as 
illustrated in Fig. 2. Figure 2a shows the ¹H spectrum recorded from a 
small, powdered sample of L-alanine using a 2.5 mm outer diameter 
rotor and a 2.5 mm cross-polarization MAS (CPMAS) probe. Subtraction 
eliminates the background signal due to stator proton signals 
and residual proton signals from the rotor cap, giving the spectrum 




Figure 1| Schematic diagram of the magic-angle coil spinning (MACS) 
insert. The sample in ordinary high-resolution solid-state NMR is placed 
inside a spinning sample holder (rotor in blue), which is pneumatically 
rotated at many thousands of revolutions per second and surrounded by 
the coil of the probe (light yellow). When the sample is too small to 
efficiently fill the rotor, the sensitivity is not optimal. In such cases a 
tuned micro-coil can be tightly wound around the sample, which is placed 
inside a glass capillary (dark yellow). A cylindrical ceramic insert (green) is 
used to keep the capillary and the tuning capacitor (red) centred while 
spinning. The scale in the figure is used only as an indication of the size of the 
rotor, as various size micro-coils and capillaries can be used (see 
Supplementary Information). Wireless coupling between the tuned circuit 
and the probe electronics generates a high RF field and enhances the 
detection sensitivity. 

in Fig. 2b. The MACS technique was then implemented using a 
commercial 7 mm CPMAS probe with a 7-mm-diameter rotor and 
a tuned micro-coil, and a powdered L-alanine sample of ~200 nl. 
Figure 2c shows the proton spectrum obtained after a π/2 excitation 
pulse, which can be easily identified as that of L-alanine. The enhancement 
in RF amplitude with respect to the 7 mm coil of the surrounding 
probe is of the order of 20, which compares well with the 
theoretical value expected from electromagnetic calculations. 
Further experiments to quantify the performance of MACS were 
conducted (see Supplementary Information for details), with Table 
1 providing a summary of the results in terms of sensitivity. MACS 
offers a signal gain of ~8 even with respect to the 2.5 mm probe. By 
virtue of the reciprocity principle, the same improvement factor 
applies to the RF field amplitude: nutation frequencies of 0.5–1 
MHz for protons were generated using 50–100 W of RF power. 
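
The quoted nutation frequencies can be checked roughly against the probe efficiency listed for MACS in Table 1 (2.03 mT per square root of a watt). The sketch below assumes that the nutation frequency is simply γB₁/2π, with the proton value γ/2π ≈ 42.577 MHz T⁻¹, and that B₁ scales as the square root of the RF power; the function and constant names are illustrative.

```python
import math

# Proton gyromagnetic ratio divided by 2*pi, in MHz per tesla.
GAMMA_1H_MHZ_PER_T = 42.577


def nutation_frequency_mhz(efficiency_mt_per_sqrt_w: float, power_w: float) -> float:
    """Nutation frequency (MHz) for a probe efficiency B1/sqrt(P) given in mT W^-1/2."""
    b1_tesla = efficiency_mt_per_sqrt_w * math.sqrt(power_w) * 1e-3
    return GAMMA_1H_MHZ_PER_T * b1_tesla


macs_efficiency = 2.03  # mT W^-1/2, MACS column of Table 1
f_50w = nutation_frequency_mhz(macs_efficiency, 50.0)
f_100w = nutation_frequency_mhz(macs_efficiency, 100.0)
print(f"50 W: {f_50w:.2f} MHz, 100 W: {f_100w:.2f} MHz")
```

Under these assumptions both values fall inside the 0.5–1 MHz range stated in the text.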

In solid-state NMR experiments, the size of the anisotropic inter- 
actions is usually of the same order of magnitude or even larger than 
the RF field used to modulate them, and this often results in in- 
complete averaging. Experiments that use dipolar and quadrupolar 
decoupling, or broadband single and multiple quantum excitation, 
could benefit from large RF fields, as the manipulation of the hamil- 
tonian could be treated more rigorously within the perturbation 
limits. The effect of the improvement in RF field amplitude offered 
by MACS has been investigated experimentally by exciting non- 
allowed multiple quantum transitions in inorganic materials (see 
Supplementary Information). 

For biological tissue samples, high-resolution MAS (HRMAS), even at 
extremely low spinning frequencies, eliminates susceptibility 
broadenings and yields high-resolution NMR spectra that allow 
quantification of metabolites in ‘metabolomics’ profiling studies. 
Spectra of bovine muscle tissue from a full 7 mm rotor (365 mg of 
sample) and from a 7 mm rotor having a MACS insert (~0.3 mg of 
sample), under moderate spinning conditions, are shown in Fig. 3a 
and b, respectively. The resolution in the spectrum using MACS is 
sufficient (residual line width ~0.05 parts per million, p.p.m., of the 
static magnetic field) to detect quickly and identify resonances 
assigned to triglycerides, lactate and phosphocreatine. Even though 
the presence of spinning sidebands (not visible within the proton 
chemical shift range of Fig. 3) implies the existence of some residual 
susceptibility effects, the sensitivity of the technique is demonstrated 
by the factor of ~22.5 in RF amplitude enhancement (and therefore 
in signal-to-noise ratio). We note here that the current tendency of 


[Figure 2, panels a–c. Horizontal axis: ¹H chemical shift (p.p.m.), 50 to −50.] 

Figure 2 | Sensitivity comparison on ¹H (proton) MAS NMR spectra 
from small samples of powdered L-alanine. a, Spectrum acquired after a π/2 
pulse (acquisition time ~12 h), using a standard 2.5 mm rotor. The sample 
(0.41 mg) was placed in the centre of the rotor. Most of this signal comes 
from the housing and rotor background, and can be eliminated. 
b, Spectrum after background subtraction (total acquisition time ~24 h), 
using the same set-up as in a (spinning sidebands are labelled with 
asterisks). The residual signal corresponds to the L-alanine signal plus some 
artefacts and has a signal-to-noise ratio (SNR) of 270. c, Spectrum obtained 
using the MACS technique on a 7 mm rotor, using 0.15 mg of sample 
(acquisition time ~8.5 min), after a π/2 pulse. The signal (SNR ~ 110) 
comes only from the L-alanine sample and contains a centre band (shown 
expanded in the inset) and all spinning sidebands at multiples of the 
spinning frequency. 

Table 1 | Sensitivity comparisons between conventional CPMAS probe sizes and MACS 

                                       7 mm system   4 mm system   2.5 mm system   MACS (750 µm coil) 
SNR per scan per sample mass (mg⁻¹)    2.44          3.22          5.86            45.8 
B₁/√P (mT W^(−1/2))                    0.134         0.166         0.306           2.03 
Relative MACS SNR enhancement          18.7          14.2          7.8             1.0 

Data were acquired on small quantities of powdered L-alanine (see Supplementary Information). As expected, the smallest coil detector offers better signal-to-noise ratio (SNR) per sample mass and 
comes with a relative gain in RF amplitude. This confirms the reciprocity principle, as expressed in the second line of the table (B₁/√P, where B₁ is the RF magnetic field for a given transmitter power P, 
represents the probe efficiency, and is proportional to the SNR of a probe). The last line shows the SNR enhancement the MACS approach offers in the case of ~200-nl-volume samples. 
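
The last row of Table 1 can be reproduced, to within rounding, from the first row: each relative enhancement is the MACS value of SNR per scan per sample mass divided by the value for the conventional system. The short check below uses only the numbers in the table; the dictionary keys are our own labels.

```python
# SNR per scan per sample mass (mg^-1), first row of Table 1.
snr_per_mg = {"7 mm": 2.44, "4 mm": 3.22, "2.5 mm": 5.86, "MACS": 45.8}

# Relative MACS enhancement = MACS value divided by each system's value.
relative_gain = {name: snr_per_mg["MACS"] / value for name, value in snr_per_mg.items()}

for name, gain in relative_gain.items():
    print(f"{name:>6}: MACS gain = {gain:.1f}")
```

The ratios agree with the tabulated 18.7, 14.2, 7.8 and 1.0 to within about 0.1, the small differences presumably coming from rounding of the tabulated SNR values.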


using very high magnetic fields to improve sensitivity is compatible 
with a reduction in sample size, but it comes at the expense of larger 
susceptibility effects because broadening increases with magnetic 
field; MACS simultaneously eliminates effects arising from both 
the intrinsic sample and the coil susceptibilities. 

In cases where the sample must be confined, such as when studying 
radioactive or biologically harmful materials, the filling factor is 
inherently low and cannot be enhanced simply by using static 
micro-coils. For example, in the first (and, to the best of our knowledge, 
only) MAS NMR study on radioactive samples, plutonium-containing 
ceramics were placed in triple barrier rotors to confine the 
emitted radiation and limit the risks in the case of rotor failure. The 
sample was positioned inside a ceramic container, which fitted inside 
a soft PTFE cylinder, with the ensemble then placed inside the rotor. 
This reduces the filling factor and hence the sensitivity of detection by 
a factor of roughly 4. To show that the MACS technique can retrieve 
this lost sensitivity, we used a rotating coil of 2.3 mm diameter (it was 
not really a micro-coil), fitted inside the innermost ceramic barrier, 
surrounded by the PTFE barrier and the 7 mm rotor. As a (dummy) 
sample, a cylinder made of Pyrex glass was used. Spectra with com- 
parable signal-to-noise ratios were recorded in considerably shorter 
acquisition times and the RF field amplitude was largely enhanced 
(see Fig. 4). MACS could thus facilitate future studies of radioactive 
materials, for it requires less experimental time or much smaller 


[Figure 3, panels a and b. Horizontal axis: ¹H chemical shift (p.p.m.).] 

Figure 3 | Proton NMR spectra of bovine muscle tissue. a, The high- 
resolution spectrum from ~365 mg of bovine muscle tissue using a full 
7 mm rotor spinning at 3,000 Hz was acquired in 33 s (8 scans). Pre- 
saturation of the water resonance (4.7 p.p.m.) was achieved using a 2 s 
irradiation before the π/2 hard pulse. b, High-resolution spectrum from 
~0.3 mg of bovine muscle tissue using the MACS insert under the same 
experimental conditions. The susceptibility broadening is distributed in 
spinning sidebands outside the range of the proton chemical shifts and the 
line width is of the order of 0.05 p.p.m. for all resonances (no deuterium lock 
was used). This width is mainly attributed to temperature gradients 
(temperature difference less than 3–4 °C). Partial assignments can be made 
on the basis of the literature (see Supplementary Information). In the 
absence of sample spinning, the susceptibility broadening hides most of the 
isotropic chemical shift information, as seen from the spectrum (inset) of the 
static sample used in a. 

quantities of samples and thus minimizes radiation activity, thereby 
rendering such studies more accessible. 
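
The gain for the confined sample can be estimated from the values quoted in the Fig. 4 caption (SNR ~33 after 1 day 21 hours without MACS; SNR ~30 after 2.8 hours with MACS). The sketch below assumes the usual square-root dependence of SNR on acquisition time and nothing else; it is an order-of-magnitude check, not the authors' analysis.

```python
import math


def snr_per_sqrt_hour(snr: float, hours: float) -> float:
    """Time-normalised sensitivity, assuming SNR grows as sqrt(acquisition time)."""
    return snr / math.sqrt(hours)


conventional = snr_per_sqrt_hour(33.0, 24.0 + 21.0)  # triple-barrier rotor, 45 h
macs = snr_per_sqrt_hour(30.0, 2.8)                  # MACS insert, 2.8 h

sensitivity_gain = macs / conventional
time_saving = sensitivity_gain ** 2  # factor in acquisition time at equal SNR
print(f"sensitivity gain ~{sensitivity_gain:.1f}, time saving ~{time_saving:.0f}x")
```

Under this assumption the sensitivity gain is a little under 4, comparable to the factor of roughly 4 that the text attributes to the reduced filling factor of the triple-barrier rotor.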

Although MACS is still at an early stage of development, our 
results show it to be a promising new way of obtaining highly sensitive 
spectra from nano- to pico-litre volume samples, as miniaturization 
is advantageous for stable spinning and the amplification of 
RF field and sensitivity. Centrifugation effects commonly present in 
HRMAS of liquids and cell suspensions are expected to be minimal 
in the MACS implementation because of the small capillary diameter. 
The high sensitivity offered by our method could potentially reduce 
the need for large biopsy samples, and HRMAS of a few cells should 
become possible. Furthermore, metabolomics studies could benefit 
from this new possibility of studying small frozen samples (temperature 
close to 0 °C), because freezing minimizes sample degradation, 
and the sample preparation does not require extraction and 
homogenization. State-of-the-art technologies in cryogenic MAS probe 
design could, in principle, be combined with MACS, as they offer the 
possibility to over-couple even smaller micro-coils and therefore 
further enhance the sensitivity of the technique. High amplitude 
RF fields of the order of MHz could change the landscape of modern 
solid-state NMR in areas such as dipolar decoupling, excitation in 
paramagnetic systems, and overtone spectroscopy, while using low 
RF power amplifiers. Applications of MACS to high-pressure 
NMR and micro-imaging promise a significant advance over 
state-of-the-art technology, and provide a framework for new medical, 
industrial and chemical analysis. Further developments in miniaturization 
technology (coil and capacitor lithography), together with 
variations in resonator geometry (RF shimming) and resonance 
mode (self-resonance), should eliminate any effects related to eddy 
currents, render the implementation more practical especially for 
smaller and faster spinning rotors, and further enhance the performance 
of MACS. 


Figure 4 | ²⁹Si MAS NMR spectra of Pyrex. Sample spinning at the magic 
angle eliminates all chemical shift anisotropy, and reveals the distribution of 
the isotropic chemical shifts for distorted tetrahedral Qⁿ sites. a, Spectrum 
acquired from a Pyrex glass sample confined inside a triple barrier rotor, 
simulating the conditions used in the acquisition of high-resolution MAS 
spectra from radioactive samples. The duration of the experiment was 1 day 
21 hours, and the measured SNR was ~33. b, Spectrum obtained using the 
MACS technique, having an SNR of ~30 after 2.8 hours of acquisition under 
the same experimental conditions as in a. 

METHODS SUMMARY 


In most experiments involving rotating micro-coils, the samples were fitted 
inside a quartz capillary. The tuned micro-coils were made of ordinary coated 
copper thin wire (50–80 µm in diameter) manually wound around the capillary, 
and soldered to a chip capacitor. The resonant frequency and the quality factor of 
the tuned micro-coil were measured. Once the rotor was placed inside the MAS 
probe, depending on the coupling regime, one or two resonances might be visible 
on the wobbling curve. Fine tuning to the exact value of the Larmor frequency 
and matching of the ensemble to 50 Ω was performed using the tuning and 
matching elements of the probe. A hollow aluminium nitride (Shapal-M) cylin- 
der was used to centre the micro-coil-supporting capillary inside commercial 
rotors and allow smooth spinning. Shapal-M has one of the highest thermal 
conductivity coefficients, is very rigid and the geometry of the insert is optimized 
to evacuate almost entirely the heat generated by eddy currents of the rotating 
coil. The measurement and analysis of power dissipation is detailed in Methods. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 

Rotating coils hardware. In most experiments involving rotating micro-coils, 
the microscopic samples were fitted inside a quartz capillary. The tuned 
micro-coils were made of ordinary coated copper thin wire (50–80 µm in diameter) 
manually wound around the capillary and soldered to a chip capacitor. A hollow 
Shapal-M cylinder, having an outer diameter (o.d.) of 5.5 mm, a length of 
14 mm and an inner diameter (i.d.) of 1.1 mm, was used to centre the 
micro-coil-supporting capillary inside commercial 7 mm o.d. rotors. Shapal-M has a 
very high thermal conductivity coefficient (>90 W m⁻¹ K⁻¹), is very rigid, and 
the geometry of the insert is optimized to evacuate almost entirely the heat 
generated by eddy currents of the rotating coil. The take-off, landing and stable 
spinning up to 7 kHz within 5 Hz were controlled using the automatic setting of 
the pneumatic MAS unit. For the confined-sample MACS experiments, we used 
a home-made triple barrier rotor system. The innermost container, made of 
Shapal-M, was 4.0 mm and 2.5 mm in o.d. and i.d., respectively. The second 
barrier was made of PTFE and had an o.d. of 5.6 mm and an i.d. of 4.0 mm. The 
zirconia rotor served as the third barrier, and the coil was directly wound on the 
cylindrical Pyrex sample. 

Tuning protocol. Using approximate formulas for the inductance of the coil 
allowed us to estimate the capacitor values in order to tune the LC circuit close 
(within 1%) to the Larmor frequency of the spins. The resonant frequency and 
the quality factor of the tuned micro-coil were measured using a spectrum 
analyser and sniffer coils. Once the rotor was placed inside the MAS probe, 
depending on the coupling regime, one or two resonances might be visible on 
the wobbling curve. Fine tuning of one of them to the exact value of the Larmor 
frequency and matching of the ensemble to 50 Ω was performed using the tuning 
and matching elements of the commercial probe. 
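
A minimal sketch of this kind of estimate is given below. It assumes Wheeler's approximate formula for a single-layer solenoid and a proton Larmor frequency computed for the 11.7 T magnet mentioned under ‘Power dissipation’. The coil dimensions (0.75 mm diameter, 1.5 mm length, 7 turns) and the 6.8 pF chip capacitor are hypothetical values chosen for illustration, not taken from the experiments.

```python
import math

MU_0 = 4.0e-7 * math.pi        # vacuum permeability, H/m
GAMMA_1H_HZ_PER_T = 42.577e6   # proton gyromagnetic ratio / 2*pi


def solenoid_inductance(radius_m: float, length_m: float, turns: int) -> float:
    """Wheeler's approximation for a single-layer solenoid (SI units)."""
    area = math.pi * radius_m ** 2
    return MU_0 * turns ** 2 * area / (length_m + 0.9 * radius_m)


def tuning_capacitance(inductance_h: float, frequency_hz: float) -> float:
    """Capacitance that makes an LC circuit resonate at the given frequency."""
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * inductance_h)


def resonant_frequency(inductance_h: float, capacitance_f: float) -> float:
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


larmor_hz = GAMMA_1H_HZ_PER_T * 11.7                 # about 498 MHz
coil_l = solenoid_inductance(0.375e-3, 1.5e-3, 7)    # hypothetical micro-coil
ideal_c = tuning_capacitance(coil_l, larmor_hz)

chip_c = 6.8e-12  # hypothetical standard chip capacitor value
detuning = abs(resonant_frequency(coil_l, chip_c) - larmor_hz) / larmor_hz
print(f"L = {coil_l * 1e9:.1f} nH, ideal C = {ideal_c * 1e12:.2f} pF, "
      f"detuning with 6.8 pF = {detuning * 100:.2f}%")
```

With these assumed values the nearest standard capacitor leaves the circuit within 1% of the Larmor frequency, and the probe's own tuning elements would then take up the remaining offset.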

Coupling regimes. In practical terms, the ‘overcoupling’ regime can be achieved 
when the volumes of the primary (large; V₁) and secondary (micro; V₂) coils 
satisfy V₂ > V₁/(Q₁Q₂), where Q₁ and Q₂ are their respective quality factors. In 
this case, and if all the RF power dissipation occurs in the detection coil, the field 
and the SNR enhancements are of the order of √[(V₁Q₂)/(V₂Q₁)]. The case of 
‘undercoupling’ is not optimal for signal sensitivity, but it can also give significant 
enhancement (of the order of Q₂) with respect to classical detection using 
the primary coil (but not as large as √[(V₁Q₂)/(V₂Q₁)]). 
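
The two expressions above can be turned into a small calculator. In the sketch below the coil volumes and quality factors are hypothetical (a volume ratio of 400 and equal quality factors of 100); they are chosen only to show how the overcoupling condition and the order-of-magnitude enhancement would be evaluated, and the function names are ours.

```python
import math


def is_overcoupled(v1: float, v2: float, q1: float, q2: float) -> bool:
    """Overcoupling condition V2 > V1 / (Q1 * Q2); volumes in the same units."""
    return v2 > v1 / (q1 * q2)


def overcoupled_enhancement(v1: float, v2: float, q1: float, q2: float) -> float:
    """Order-of-magnitude field and SNR enhancement, sqrt((V1 * Q2) / (V2 * Q1))."""
    return math.sqrt((v1 * q2) / (v2 * q1))


# Hypothetical example: primary coil 400 times larger in volume, equal Q factors.
v_primary, v_micro = 400.0, 1.0
q_primary, q_micro = 100.0, 100.0

print(is_overcoupled(v_primary, v_micro, q_primary, q_micro))
print(overcoupled_enhancement(v_primary, v_micro, q_primary, q_micro))
```

For these assumed values the condition is met and the estimated enhancement is 20, the same order as the RF amplitude gain reported for the 7 mm probe.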

Power dissipation. Heating effects due to eddy (or Foucault) currents in the coil 
have been observed and measured using a sample of lead nitrate. Stabilization 
cylinders made of KelF and Shapal-M were tested and their thermal properties 
compared. The measured values of the ²⁰⁷Pb isotropic chemical shift indicate an 
increase in sample temperature of less than 30 °C when the powder sample is 
surrounded by a 900 µm o.d. copper micro-coil, a thermal conductivity paste, 
and a Shapal-M stabilization insert inside a rotor spinning at 7 kHz in an 11.7 T 
magnet (10 °C in temperature was recorded using a standard MAS rotor under 
the same spinning conditions, see Supplementary Information). The calculated 
power dissipation using a finite-element model (Flux3D, Cedrat Technologies) 
is ~100 mW. These effects increase with the strength of the static magnetic field 
and the spinning frequency. They also depend strongly on the wire diameter, and 
are expected to become negligible as the coil and wire sizes decrease. We are 
currently investigating better heat sinks for heat diffusion inside the rotor, and 
we are exploring alternatives to diminish eddy current effects (preliminary 
experiments using micro-coils made of so-called Litz wire proved to offer 
significant advantages). 

Low Atlantic hurricane activity in the 1970s and 
1980s compared to the past 270 years 


Johan Nyberg’, Bjorn A. Malmgren’, Amos Winter®, Mark R. Jury*, K. Halimeda Kilbourne’’® & Terrence M. Quinn?’”® 


Hurricane activity in the North Atlantic Ocean has increased sig- 
nificantly since 1995 (refs 1, 2). This trend has been attributed to 
both anthropogenically induced climate change and natural vari- 
ability, but the primary cause remains uncertain. Changes in the 
frequency and intensity of hurricanes in the past can provide 
insights into the factors that influence hurricane activity, but reliable 
observations of hurricane activity in the North Atlantic only 
cover the past few decades. Here we construct a record of the 
frequency of major Atlantic hurricanes over the past 270 years 
using proxy records of vertical wind shear and sea surface temperature 
(the main controls on the formation of major hurricanes 
in this region) from corals and a marine sediment core. The 
record indicates that the average frequency of major hurricanes 
decreased gradually from the 1760s until the early 1990s, reaching 
anomalously low values during the 1970s and 1980s. Furthermore, 
the phase of enhanced hurricane activity since 1995 is not unusual 
compared to other periods of high hurricane activity in the record 
and thus appears to represent a recovery to normal hurricane 
activity, rather than a direct response to increasing sea surface 
temperature. Comparison of the record with a reconstruction of 
vertical wind shear indicates that variability in this parameter 
primarily controlled the frequency of major hurricanes in the 
Atlantic over the past 270 years, suggesting that changes in the 
magnitude of vertical wind shear will have a significant influence 
on future hurricane activity. 

The years from 1995 to 2005 experienced an average of 4.1 major 
Atlantic hurricanes (category 3 to 5) per year, while the years 1971 to 
1994 experienced an average of 1.5 major hurricanes per year. A 
major hurricane is defined as a tropical cyclone with maximum sustained 
(1 minute) surface (measured 10 m above the surface) winds of 
≥50 m s⁻¹. This increase in major hurricane frequency is thought to 
be caused by weaker vertical wind shear |Vz| and warmer sea surface 
temperatures (SSTs) in the tropical and subtropical Atlantic, which 
some studies have attributed to a natural multidecadal variability in 
the thermohaline circulation, termed the Atlantic Multidecadal 
Oscillation (AMO), and other studies to anthropogenic climate 
change. Little information exists about the effects of the AMO variability 
on changes in tropical Atlantic climate and magnitudes of 
hurricane activity. Although hurricane intensity and destructiveness 
may increase with increasing global mean temperatures, the effect of 
climate warming on hurricane frequency is poorly known. Furthermore, 
it is possible that hurricane activity responds to changes in other 
external forcings, such as solar activity and aerosol loading. The 
reliable observation record of hurricane activity over the North 
Atlantic exists only from 1944 (ref. 2) with continual satellite coverage 
from 1966 (ref. 10), providing a temporally limited perspective on 
these issues. The existing historical records of hurricane activity are 
based on archives documenting primarily US landfalls. 

The Main Development Region (MDR) is where 85% of all 
Atlantic major hurricanes and 60% of all non-major hurricanes (33 
to 50 m s⁻¹) and tropical storms (18 to 32 m s⁻¹) are formed. The 
MDR is an area westward of Africa across the tropical Atlantic and 
Caribbean Sea at latitudes between 10° and 20° N (Figs 1 and 2). Here 
tropical cyclones are formed when easterly atmospheric waves propagate 
from Africa across the tropical North Atlantic. Because the 
number of easterly waves is fairly constant from year to year, the 
dominant factors for major hurricane formation are the magnitude 
of |Vz| and SSTs in the MDR during August to October, when almost 
all major hurricanes are formed. Local vertical wind shear 
|Vz| > ~8 m s⁻¹ (for example, upper winds opposed to lower easterly 
trade winds) is in general unfavourable for the formation of tropical 
cyclones owing to distortion of the vertical structure of the convective 
cloud cells. The vertically tilted cloud cells are limited in their capacity 
to provide energy to the storm. 
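
The wind-speed classes used in the text can be written as a small helper. The sketch below takes the lower bound of each quoted range as the class threshold (18, 33 and 50 m s⁻¹); how speeds that fall between the quoted ranges are assigned is our assumption, and the function name is illustrative.

```python
def classify_cyclone(max_sustained_wind_ms: float) -> str:
    """Classify a tropical cyclone by maximum sustained surface wind (m/s)."""
    if max_sustained_wind_ms >= 50.0:
        return "major hurricane"
    if max_sustained_wind_ms >= 33.0:
        return "non-major hurricane"
    if max_sustained_wind_ms >= 18.0:
        return "tropical storm"
    return "below tropical-storm strength"


for wind in (12.0, 25.0, 40.0, 62.0):
    print(f"{wind:5.1f} m/s -> {classify_cyclone(wind)}")
```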

Local SSTs of ~27 °C or more and a warm mixed layer down to a 
depth of ~50 m are also considered necessary for major hurricane 
development. Empirical studies indicate, however, that warmer 
local SSTs are not central in the formation of major hurricanes. 
Regional Atlantic SSTs appear to be more important for the formation 
of major hurricanes through their interdependence with |Vz| in 
the MDR, which may arise from interactions with the Pacific El 
Niño Southern Oscillation (ENSO). Warmer SSTs in the North 
Atlantic region coincide with reduced |Vz| in the MDR and vice 
versa. 

Comparisons between the hurricane index and observed zonal 
winds show significant positive (westerly) correlation values from 
the surface up to 1.5 km (850 hPa) over latitudes ~10–20° N (Fig. 1), 
while significant negative (easterly) correlation values exist around 
12 km (200 hPa) in these latitudes. Consequently, by using proxies 
of |Vz| (trade wind strength) and SST anomalies in the MDR, robust 
longer-term estimations of major hurricane activity are possible. 

To estimate |Vz| we use four luminescence intensity records for the 
months August to October derived from coral cores retrieved in the 
Caribbean off the southern Dominican Republic, south-western 
Puerto Rico and Mona Island together with one annual abundance 
record of the planktonic foraminifer Globigerina bulloides in a 
well-dated (¹⁴C and ²¹⁰Pb) sediment core from the Cariaco basin in the 
southern Caribbean Sea (Fig. 2a–d). 

The significant relationships shown in Fig. 2a–d (see also Supplementary 
Information) are explained as follows: luminescence 
intensity in corals reflects the degree of terrestrial runoff, which is 
controlled by the amount of precipitation. Decreased precipitation 
in the northeastern Caribbean during the hurricane season is associated 
with increased trade-wind speed and high |Vz| over the 
MDR (Fig. 2). Increased trade-wind speed corresponds to 
higher sea-level pressures, enhanced sinking motion and drying 
and a more stable lower atmosphere, which results in lower precipitation 
and a more sheared environment in the tropical Atlantic 
during the hurricane season. Higher abundance of G. bulloides 
reflects more nutrient supply caused by enhanced upwelling due to 
increased trade-wind strength, which is related to high |Vz| over the 
MDR (Fig. 2d, see also Supplementary Information). In addition, 
these wind-speed records are tightly linked to larger-scale SST anomalies 
in the North Atlantic region, which supports their robustness 
as a proxy of major hurricane activity. Figure 2a–d also displays an 
association between these four proxies and |Vz| north of the MDR 
that is out of phase with |Vz| within the MDR, supporting instrumental 
observations of an out-of-phase relationship between horizontal 
wind shear on either side of the main track of Atlantic 
hurricanes. 

¹Geological Survey of Sweden, Box 670, SE-751 28 Uppsala, Sweden. ²Department of Earth Sciences, Göteborg University, Box 460, SE-405 30 Göteborg, Sweden. ³Department of 
Marine Sciences, University of Puerto Rico, PO Box 9013. ⁴Department of Physics, University of Puerto Rico, PO Box 9016, PR 00681-9013, Mayagüez, Puerto Rico. ⁵College of Marine 
Science, University of South Florida, 140, St Petersburg, Florida 33707, USA. ⁶Physical Sciences Division R/PSD1, NOAA, Earth System Research Laboratory, 325 Broadway, Boulder, 
Colorado 80305, USA. ⁷Department of Geological Sciences, Jackson School of Geosciences, University of Texas at Austin, 1 University Station C1100, Austin, Texas 78712, USA. 
⁸Institute for Geophysics, J. J. Pickle Research Campus, University of Texas at Austin, 10100 Burnet Road, Austin, Texas 78758, USA. 

Figure 1 | Simultaneous correlation of zonal winds in August–September 
(1955–2004) with the unsmoothed Atlantic hurricane index. NCEP/ 
NCAR Reanalysis (285° E to 310° E). Positive (or negative) values on the 
colour scale refer to increased westerlies (or easterlies) in active hurricane 
years. Correlation values exceeding ±0.2 are significant above the 90% 
confidence level. The zonal winds are within the area marked by dashed 
lines in b. 

Figure 2 | Spatial correlations between instrumentally observed vertical 
wind shear (|Vz|) in August–October and the proxies used. Luminescence 
intensity during August to October in the coral cores retrieved outside Mona 
Island, 1949–1992 (a), Catalina Island, 1949–1993 (b) and La Parguera, 
1949–2000 (c). d, Annual abundance of the planktonic foraminifer 
Globigerina bulloides in the Cariaco basin, 1949–1990. The colour scale 
refers to all panels. Only statistically significant correlation values exceeding 
the 0.01 (rs < −0.4 and rs > 0.4) and 0.001 (rs < −0.5 and rs > 0.5) levels are 
shown and referred to in the colour scale. Solid lines in a–d give the north 
and south boundaries of the MDR. 

Back propagation artificial neural networks were used to estimate 
past |Vz| and major hurricane activity. The networks were trained to 
learn the relationships between the combined input (independent) 
proxy records and each of the two output (dependent) instrumental 
records of |Vz| and number of major hurricanes, respectively (Fig. 3). 
The SST anomaly record was averaged between latitudes of 10 and 
25° N and longitudes of 20 to 85° W, because the region where SST 
anomalies directly affect major hurricane activity is within and 
around the MDR. Although instrumentally recorded |Vz|, luminescence 
intensity and G. bulloides show the strongest correlations 
with SST anomalies in the North Atlantic region between 50 and 
60° N (refs 1, 16 and 17), these records also show statistically significant 
relationships with the instrumental SST anomaly record, averaged 
over the MDR, using five-year running averages (Fig. 3). This 
demonstrates the physical link between SST anomalies in the MDR 
and |Vz|. 
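
To make the idea of a back propagation network concrete, the sketch below trains a very small one-hidden-layer network by gradient descent on synthetic data. It is a generic illustration only: the network size, learning rate, number of epochs and the synthetic ‘proxy’ and ‘hurricane count’ values are all our assumptions and do not reproduce the networks, inputs or results of the reconstruction.

```python
import math
import random


def train_network(inputs, targets, hidden=3, epochs=500, lr=0.05, seed=0):
    """Train a one-hidden-layer tanh network with per-sample back propagation.

    Returns a prediction function and the mean squared error of each epoch.
    """
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = [0.0]  # one-element list so the nested function sees updates

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hi for w, hi in zip(w2, h)) + b2[0]

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, t in zip(inputs, targets):
            h, y = forward(x)
            err = y - t                      # gradient of 0.5 * err^2 w.r.t. y
            total += err * err
            for j in range(hidden):
                # Propagate the error back through the tanh hidden unit.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2[0] -= lr * err
        losses.append(total / len(inputs))
    return (lambda x: forward(x)[1]), losses


# Synthetic stand-ins: two standardized proxies and a made-up dependent series.
data_rng = random.Random(1)
proxies = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(40)]
counts = [3.0 - 2.0 * shear + 1.0 * sst for shear, sst in proxies]

predict, losses = train_network(proxies, counts)
print(f"MSE in first epoch: {losses[0]:.3f}, in last epoch: {losses[-1]:.3f}")
```

In a real application the inputs would be the standardized proxy records and the target an instrumental series over the calibration period, with part of the data held back to check that the network generalizes.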

Figure 3 shows that the switch from low to high abundance of G. 
bulloides, high to low luminescence intensity, and high to low SST 
anomalies is coincident with the shift towards high-|Vz| conditions 
and decreased major hurricane activity in the period 1965–1971 (refs 
1, 2). The shift back towards low-|Vz| conditions and increased major 
hurricane activity starts around 1987–1988, but is suppressed by the 
long-lasting (1990–1995) El Niño event (Fig. 3). The record of 
G. bulloides (until 1990) reflects the shift around 1987–1988, while 
the luminescence-intensity records reflect both the El Niño event 
(1990–1995) and the evident shift towards increased major hurricane 
activity around 1995 (ref. 1 and Fig. 3). 

The downward trend in the frequency of major hurricanes from 
the 1940s to 1970s in the reliable observation record is matched by the 
reconstruction (Fig. 3). The reconstruction also follows the variability 
during the 19th and 20th centuries of US East Coast hurricane 
landfalls, with a quiet period during the 1850s to the late 
1860s, an active period from the 1870s to the 1890s, a quiescent 
period to 1926, and an active phase from 1926 to 1970. We note 
the reconstructed high activity around 1886, which is the most active 
hurricane season on record for the continental United States. In 
addition, the reconstructed |Vz| closely follows observed annual 
zonal wind speed data in the Caribbean back to 1890, and high 
reconstructed |Vz| values back to 1851 coincide with low observed 
major hurricane activity (Fig. 3). The exact numbers of observed 
major Atlantic hurricanes before 1944 are less accurate owing to the 
lack of observational networks, which probably explains the difference 
between the numbers of reconstructed and instrumentally 
observed major hurricanes (Fig. 3). 

Figure 3 | The reconstructed major hurricane activity and |Vz| series back 
to 1730. Also shown are the reliable observation records back to 1944 (ref. 2) 
and 1949 (ref. 19), respectively, the historical hurricane record back to 1851 
(refs 10, 11), the ERSST v2 data averaged over 10 to 26° N and 20 to 86° W 
during August–October back to 1854, zonal wind speed data centred at 11° N 
and 65° W (ref. 30) back to 1890 together with the SST anomalies, 
luminescence intensities, and abundance of G. bulloides upon which the 
reconstructions were based. The dashed lines indicate 95% confidence 
intervals for estimated numbers of hurricanes and |Vz| values. All data are 
smoothed with a five-year running average. The ‘master’ luminescence curve 
is developed by averaging standardised luminescence intensity values during 
Aug–Oct in all of the coral cores for each year. 

The reconstruction shows that there have been on average ~3–3.5 major
hurricanes and a |Vz| of ~8–9 m s⁻¹ per year from 1730 to 2005. A gradual
downward trend is evident from an average of ~4.1 (1755–1785) to ~1.5
major hurricanes during the late 1960s to early 1990s, which experienced
strong |Vz| and few major hurricanes compared to other periods since 1730.
Only the periods ~1730–1736, 1793–1799, 1827–1830, 1852–1866 and 1915–1926
appear to have been marked by similarly low major hurricane activity and
high |Vz|. Furthermore, the current active phase (1995–2005) is
unexceptional compared to the other high-activity periods of ~1756–1774,
1780–1785, 1801–1812, 1840–1850, 1873–1890 and 1928–1933 (Fig. 3), and
appears to represent a recovery to normal hurricane activity, despite the
increase in SST.

Wavelet spectral analyses together with spectral analyses reveal the
existence of significant ~8–11 and ~20–30-year cycles in the records
(see Supplementary Information). Decadal signals in occurrences,
formation areas, and landfalls of tropical storms and hurricanes have
also been identified elsewhere and linked to the North Atlantic
Oscillation.

To improve our understanding further, the derived records are compared
with indices of the AMO and total solar irradiance (TSI) (Fig. 4). Reduced
major hurricane activity coincides with a lower AMO index around
1820–1830, 1910–1920 and 1970–1990; enhanced activity coincides with a
high index around 1750–1790, 1870–1900 and 1930–1960 (Fig. 4). Peaks and
trends of higher major hurricane activity concur with lower TSI, and vice
versa, several times since 1730 (Fig. 4). Results from a general
circulation model show that circulation changes in the upper stratosphere,
induced by interactions between solar irradiance and ozone levels, may
penetrate down to the troposphere, where surface winds and sea level
pressures are affected. In addition, the general circulation model shows a
forced shift towards decreased sea level pressure in the subtropical
Atlantic during reduced TSI, which would result in weaker easterly trade
winds, weaker |Vz| and higher major hurricane activity, and thus explains
some of our observations.

Figure 4 | The reconstructed major hurricane and |Vz| series compared to
total solar irradiance and AMO indices. The instrumental records of major
hurricane activity and |Vz| (five-year running averages) are shown from
1944 and 1949, respectively, to 2005. TSI and |Vz| have increasing values
downwards. Shaded lines mark concurrent peaks and lows in TSI, hurricane
activity and/or |Vz|.

Our results suggest that the frequency of major hurricanes since 1995 is
not unusual, indicating that increases in SST during the past 270 years
have been offset by increased |Vz|, which suppresses major hurricanes. A
more rapid warming of the atmosphere relative to the ocean could have
caused the anomalous calm period between the 1970s and 1990s. Air
temperatures near the level of the trade-wind inversion (1.5 km) as well
as 10 m air temperatures during the past 50 and 100 years, respectively,
averaged over the Caribbean (see Supplementary Information), have risen
faster than SSTs, indicating an enhanced stability of the lower atmosphere
and a strengthening of the trade-wind inversion that reduces the influence
of thermodynamic energy from a warmer ocean. This physical mechanism leads
to enhanced subsidence, trade-wind strength and |Vz| in the MDR. The
reconstructed |Vz| series may indicate that this trend has occurred over a
longer period.

The future possibility of lower |Vz| combined with increased SSTs
in the MDR (Figs 3 and 4) may result in longer storm lifetimes and
more moist enthalpy to power developing tropical cyclones, causing
higher hurricane frequencies and greater storm intensities.


METHODS SUMMARY 


The correlation coefficients are computed using the Spearman rank
correlation (rs) owing to the presumed non-normality of the data series.
The statistical significances are judged using a two-tailed significance
test. The vertical wind shear Vz is calculated as the vector difference
between the 200 and 850 mbar climatological winds averaged for
August–October from the NCEP/NCAR Reanalysis gridded (2.5° × 2.5°
interval) monthly mean wind data set over the MDR. A single ‘master’ coral
luminescence intensity curve was developed by averaging standardized
luminescence intensity values in all of the four cores for each year.
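
The vector-difference definition of vertical wind shear can be illustrated with a short Python sketch. It is not the authors' processing code: the function names and the monthly wind components are hypothetical, and it assumes that the August–October mean is taken on each wind component before the magnitude of the 200–850 mbar difference is computed.

```python
import math


def seasonal_mean(values):
    """Arithmetic mean of monthly values (here August, September, October)."""
    return sum(values) / len(values)


def shear_magnitude(u200, v200, u850, v850):
    """Magnitude (m/s) of the vector difference between 200 and 850 mbar winds."""
    return math.hypot(u200 - u850, v200 - v850)


# Hypothetical monthly mean wind components (m/s) for a single grid cell.
# u is the zonal (eastward) component, v the meridional (northward) component.
u200_monthly = [2.0, 3.0, 4.0]
v200_monthly = [1.0, 1.0, 1.0]
u850_monthly = [-6.0, -5.0, -4.0]  # easterly trade winds at low levels
v850_monthly = [-5.0, -5.0, -5.0]

shear = shear_magnitude(
    seasonal_mean(u200_monthly),
    seasonal_mean(v200_monthly),
    seasonal_mean(u850_monthly),
    seasonal_mean(v850_monthly),
)
print(f"|Vz| = {shear:.1f} m/s")
```

An area average over the MDR would repeat this calculation for each grid cell; whether cells are averaged before or after taking the magnitude is not stated in the text, so that step is left out of the sketch.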

The Trajan 6.0 Neural Network software was used for the autonomous learn- 
ing of the relationships between the independent variables (master luminescence 
intensity, number of G. bulloides and SSTs) and dependent variables (number of 
instrumentally observed major hurricanes and vertical windshear). The data sets 
were split so that ~67% of the samples were used in the training set, ~20% in the
selection set, and the remaining samples in the test set. Each of the reconstructed 
time series was derived from ensembles constituting ten independent networks. 
The prediction errors—the root-mean-square errors of predictions—were com- 
puted for each network as the square root of the sum of the squared differences 
between the observed and predicted values divided by the number of samples in 
the test sets using different training, selection and test cases. The 95% confidence 
intervals for the estimated values were then based on these ten different runs. 
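
The split and error measure described above can be sketched in Python as follows. This is an illustration only, not the Trajan workflow: the function names are hypothetical, and the interval function is one possible reading of "based on these ten different runs" (mean ± 1.96 standard deviations of the ensemble members, a normal approximation that the text does not specify).

```python
import math
import statistics


def split_sizes(n_samples, train_frac=0.67, selection_frac=0.20):
    """Sizes of the training, selection and test sets (test takes the remainder)."""
    n_train = round(n_samples * train_frac)
    n_selection = round(n_samples * selection_frac)
    return n_train, n_selection, n_samples - n_train - n_selection


def rmse(observed, predicted):
    """Root-mean-square error of prediction over the test samples."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def ensemble_interval(member_predictions, z=1.96):
    """Assumed 95% interval from the spread of independent network predictions."""
    centre = statistics.mean(member_predictions)
    spread = statistics.stdev(member_predictions)
    return centre - z * spread, centre + z * spread


print(split_sizes(100))
print(rmse([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))
```

In this reading each of the ten networks supplies one prediction per year, and the interval for that year is computed from those ten values.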

|Vz| and major hurricane activity are associated primarily with the
large-scale decadal SST fluctuations in the Atlantic. Therefore, to
emphasize lower-frequency variations, a five-year running average was used
when training the networks. Significance tests account for the additional
auto-correlation imposed by the filter.
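
A five-year running average of an annual series can be written in a few lines of Python. The sketch below assumes a centred window that drops the two years at each end of the record; the text does not say whether the authors' filter was centred or how the end points were treated, and the sample values are hypothetical.

```python
def running_mean(series, window=5):
    """Centred running mean; the first and last window // 2 values are dropped."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    return [
        sum(series[i - half:i + half + 1]) / window
        for i in range(half, len(series) - half)
    ]


# Hypothetical annual counts of major hurricanes.
annual_counts = [1, 2, 3, 4, 5, 6, 7]
print(running_mean(annual_counts))
```

Because neighbouring smoothed values share four of their five input years, the smoothed series is strongly autocorrelated, which is why the significance tests have to allow for the filter.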


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.

METHODS

Vertical windshear. The correlation coefficient (rs) between the observed
number of major hurricanes and |Vz| averaged over the whole MDR during
August to October is −0.76 (P < 0.001; time interval 1949–2003; five-year
moving average). Our proxies respond to surface (mainly easterly) winds
that are related to this zonal overturning Walker circulation. rs between
the instrumentally observed annual wind speed shown in Fig. 3 and the
instrumentally observed vertical windshear averaged over the MDR in
August–October from 1949 to 1992 is 0.90. rs between the instrumentally
observed annual wind speed and the estimated Vz derived from the neural
network algorithm from 1890 to 1990 is 0.85.
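
For readers who want to reproduce coefficients of this kind, the Spearman rank correlation is the Pearson correlation of the ranks, with tied values given their average rank. The pure-Python sketch below shows the calculation; the function names and the two short series are hypothetical, and it does not include the significance test or the adjustment for autocorrelation mentioned in the Methods Summary.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation coefficient rs."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical smoothed series: wind shear (m/s) and major hurricane counts.
shear = [7.5, 8.0, 9.2, 10.1, 8.8]
hurricanes = [4.0, 3.2, 2.1, 1.4, 2.6]
print(f"rs = {spearman(shear, hurricanes):.2f}")
```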

Coral records. Luminescence and reflectivity were measured in ~4-mm-thick 
slices, dried, whitened and cleaned of organic material and adherent
contaminants, and cut parallel to the growth axis in massive Montastraea faveolata
skeletons. The core outside Catalina Island, southeastern Dominican Republic 
was drilled in March 1999 and spans back to 1847, and the core outside Mona 
Island was drilled in May 1998 and spans back to 1684 (ref. 16). The two cores 
outside La Parguera, southwestern Puerto Rico, were drilled in February 1998 
and 2004, respectively. Four U/Th dates retrieved at the University of Minnesota, 
together with density-band counting, demonstrate that the first core represents 
ages ranging from 1768 to 1957. The second core spans back to 1870 on the basis 
of density-band counting. 

The luminescence and reflectivity of the coral slices were measured in a
plate reader attached to a Perkin-Elmer Model LS 50B luminescence
spectrometer. The plate scan speed was set to 30 mm min⁻¹ and the
excitation and emission slit widths to 2.5 mm. The measurements were made
every 0.1 mm along the growth axis through a lamp with a diameter of 1 mm.
An excitation wavelength of 390 nm and an emission wavelength of 490 nm
were chosen. The relative luminescence was calculated according to ref.
31, in which measurements of luminescence and reflectivity of coral slices
are corrected using measurements of background and calcium carbonate
standards. For the background standard a black plate with roughened
surface was used on which the coral slices were laid down during
measurements. The calcium carbonate standard used was Suprapur CaCO₃ 99.95
(Merck, Darmstadt, Germany).

Luminescence and reflectance were measured for both the background and
calcium carbonate standards at the start and end of each luminescence and
reflectance run. Beginning and end measurements were required to be within
2% of each other. Luminescence and reflectivity were measured at the same
points along 3 to 6 different columellas (growth axis) through the coral
cores, avoiding visible gaps and holes. Similar results were obtained and
the averaged results from the luminescence profiles were used.

Coral chronology. The luminescence profiles were converted to time series
by setting annual luminescence minima to occur in summer. This is the
approximate time when high-density bands precipitate in Montastraea
faveolata in this region. August, September and October for each year were
then determined by measuring the extension rate and assuming a linear
relationship between time and extension rate for the given year. The
average relative luminescence in the August–October interval is used.
Thus, these data refer to a seasonally specific relationship between trade
winds (for example, |Vz|) and rainfall.

The youngest 30–40 mm of growth in the coral cores were omitted owing to
the risk of contamination from the upper 6–8-mm-thick, anomalously high
luminescence tissue layer, which is possibly associated with oxidized
organics. Cross-dating of characteristic luminescent bands between the
different cores was applied. After correcting ages, the annual
luminescence intensity values were standardized in each core. The maximum
value was assigned to be 3 and the minimum value −3.
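
The standardization and the construction of the ‘master’ curve can be sketched in Python. The sketch assumes that values between each core's minimum and maximum are rescaled linearly onto the range −3 to 3, which the text implies but does not state, and that the master value for a year is the mean over the cores that cover that year. All names and numbers are hypothetical.

```python
def rescale(values, low=-3.0, high=3.0):
    """Linearly map values so that the minimum becomes low and the maximum high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot rescale a constant series")
    return [low + (v - v_min) / (v_max - v_min) * (high - low) for v in values]


def master_curve(cores):
    """Average standardized values per year over the cores that cover that year."""
    years = sorted(set().union(*cores))
    master = {}
    for year in years:
        available = [core[year] for core in cores if year in core]
        master[year] = sum(available) / len(available)
    return master


# Hypothetical annual luminescence intensities for two cores.
core_a = dict(zip([1900, 1901, 1902], rescale([10.0, 20.0, 30.0])))
core_b = dict(zip([1901, 1902], rescale([4.0, 2.0])))
print(master_curve([core_a, core_b]))
```
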
Multiproxy calibrations and reconstructions. Back propagation artificial
neural networks were used for the reconstructions, because the nonlinear
mapping of the input variables to the output variable inherent in
artificial neural networks was found to provide better predictions than
conventional linear regression analysis (see Supplementary Information).
The correlation coefficients (rs) between the reconstructed and
instrumentally recorded |Vz| and number of major hurricanes are 0.97
(P < 0.001, 1949–1990) and 0.92 (P < 0.001, 1944–1990), respectively, and
the root-mean-square errors of predictions are 0.21 m s⁻¹ for |Vz| and
0.21 for hurricanes, respectively.

Warm-season (April–September) SSTs used in the hurricane activity and
vertical wind shear reconstructions were retrieved from 5° × 5° grids. The
raw warm-season instrumental data, which were used to calibrate
temperature reconstructions in ref. 20, were also used here in the
training of the networks and in the reconstructions back to 1902. A
comparison between the SST data used and the ERSST v2 data averaged over a
slightly larger region (10 to 26° N and 20 to 86° W) during August–October
back to 1856 yields an rs of 0.77 (P < 0.001). rs between ERSST v2,
averaged over the region referred to above, and the instrumental |Vz|,
averaged over the MDR, is −0.38 (P < 0.01; 1950–2003). rs between ERSST v2
and number of G. bulloides is −0.43 (P < 0.001; 1900–1990). rs between
ERSST v2 and the master luminescence record is 0.28 (P < 0.05; 1950–2000).

No significant statistical relationships are found between the AMO index
of ref. 26 and the major hurricane and |Vz| records (1730–1985). The rs
between the AMO index of ref. 6 and the major hurricane record is 0.32
(P < 0.001; 1858–2003) and that between the AMO index of ref. 6 and |Vz|
is −0.22 (P < 0.01; 1858–2003). The rs between TSI and the major hurricane
record is −0.37 (P < 0.001; 1730–2003) and that between TSI and |Vz| is
0.44 (P < 0.001; 1730–2003) using the instrumental major hurricane
observations since 1944 and |Vz| since 1949 (Fig. 4). Observational bias
adjustments are taken into account by using 52 m s⁻¹ as a threshold for
major hurricanes during the time period 1944–1969.


Boron and oxygen isotope evidence for recycling of
subducted components over the past 2.5 Gyr


Simon Turner¹, Sonia Tonarini², Ilya Bindeman³†, William P. Leeman⁴ & Bruce F. Schaefer⁵


Evidence for the deep recycling of surficial materials through the Earth’s
mantle and their antiquity has long been sought to understand the role of
subducting plates and plumes in mantle convection. Radiogenic isotope
evidence for such recycling remains equivocal because the age and location
of parent–daughter fractionation are not known. Conversely, while stable
isotopes can provide irrefutable evidence for low-temperature
fractionation, their range in most unaltered oceanic basalts is limited
and the age of any variation is unconstrained. Here we show that δ¹⁸O
ratios in basalts from the Azores are often lower than in pristine mantle.
This, combined with increased Nb/B ratios and a large range in δ¹¹B
ratios, provides compelling evidence for the recycling of materials that
had undergone fractionation near the Earth’s surface. Moreover, δ¹¹B is
negatively correlated with ¹⁸⁷Os/¹⁸⁸Os ratios, which extend to
subchondritic values, constraining the age of the high Nb/B, ¹¹B-enriched
endmember to be more than 2.5 billion years (Gyr) old. We infer this
component to be melt- and fluid-depleted lithospheric mantle from a
subducted oceanic plate, whereas other Azores basalts contain a
contribution from ~3-Gyr-old melt-enriched basalt. We conclude that both
components are most probably derived from an Archaean oceanic plate that
was subducted, arguably into the deep mantle, where it was stored until
thermal buoyancy caused it to rise beneath the Azores islands ~3 Gyr
later.

The dynamics of the Earth reflect its internal heat but the nature 
and timescales of mantle convection remain poorly constrained. 
Over the past decade tomography data have provided spectacular 
images of seismically fast material inferred to be cool zones of down- 
welling associated with subducting plates and suggest that these 
can extend beyond the 670 km discontinuity into the deep mantle, 
ponding at the core–mantle boundary. Conversely, a significant


Table 1 | δ¹¹B and δ¹⁸O data for Azores basalts


component of return flow is associated with mantle plumes, many of which,
including the Azores, appear to rise from the core–mantle boundary. Thus,
there has been much interest in developing independent evidence for
entrainment of subducted material from the composition of ocean island
basalts erupted above plumes. Radiogenic isotopes have long been used in
this search because of their potential to constrain the timescales of
recycling and many ocean island basalts have indeed been found to have
signatures distinct from those of mid-ocean ridge basalts (MORB) that
sample the upper mantle. However, although such signals undoubtedly
reflect the time-integrated effects of fractionated parent–daughter
element ratios, the age and extent of this fractionation can rarely be
deconvolved. Even if this is possible, the parent–daughter fractionation
is not restricted to processes occurring near the Earth’s surface and
could instead reflect intra-mantle metasomatism. In contrast,
fractionation of isotopes of light elements such as O, B and Li results
from low-temperature processes near the Earth’s surface, and significant
variations in the stable isotope ratios of MORB and ocean island basalts
could provide unambiguous evidence for contributions from recycled
material. However, the range of O and B isotope ratios observed in MORB
and ocean island basalts has generally been rather restricted and observed
variations are often attributed to shallow-level assimilation of altered
oceanic crust. Furthermore, stable isotopes cannot be used to constrain
the timescales of recycling. Recent Os isotope analyses found that seven
basalts from the Azores had subchondritic Os isotope ratios and here we
supplement those data with new subchondritic Os data from a picrite from
Faial which has 14% MgO, 0.117 p.p.b. Os and a ¹⁸⁷Os/¹⁸⁸Os ratio of
0.12559. These require a contribution from a component which must be at
least 2.5 Gyr in age and in Table 1 we report the first B and O isotope
data from a subset of these well characterized samples.


Sample number  Island  MgO (%)  Nb (p.p.m.)  B (p.p.m.)  δ¹¹B (‰)  ±1 s.e.m.  Number of analyses, n  δ¹⁸O (‰)  ±1 s.e.m.  n

S1  Sao Miguel  7.76  70.7  5.4  −6.0  0.5  1  4.88  0.058  3
S3  Sao Miguel  8.34  77.9  6.2  −6.8  0.1  2  Not analysed
S10  Sao Miguel  8.33  68.8  5.0  0.5  2  5.10  0.007  2
S19  Sao Miguel  6.38  87.5  5.9  −7.4  0.2  1  5.28  0.028  2
SJ26  Sao Jorge  4.71  90.9  5.4  −3.3  0.1  2  5.14  0.047  3
SJ30  Sao Jorge  9.70  49.1  3.1  −4.7  0.3  1  4.94  0.057  2
T6  Terceira  8.70  40.8  Not analysed  4.87  0.070  3
T18  Terceira  7.94  41.5  3.4  −6.0  0.2  2  4.88  0.030  2
P5  Pico  9.63  37.6  2.4  −3.6  0.2  1  5.10  0.048  2
P25  Pico  8.24  43.1  3.8  −3.5  0.3  1  5.02  0.014  2
P29  Pico  8.16  52.4  3.7  −4.1  0.2  1  5.14  0.070  3
FCA-6  Faial  9.73  35.0  2.6  −3.3  0.5  2  5.02  0.014  2
FCA-24  Faial  13.99  30.2  5.9  −2.1  0.4  2  5.14  0.014  3
FCP-18  Faial  7.83  46.0  4.4  −7.6  0.5  2  5.09  0.057  2

¹GEMOC, Department of Earth and Planetary Sciences, Macquarie University, Sydney, New South Wales 2109, Australia. ²Istituto di Geoscienze e Georisorse, Via Moruzzi 1, 56147
Pisa, Italy. ³Department of Geology and Geophysics, University of Wisconsin, 1215 West Dayton Street, Madison, Wisconsin 53706, USA. ⁴National Science Foundation, 4201 Wilson

The majority were measured in duplicate or triplicate. ¹⁸O/¹⁶O data were
obtained by high-precision CO₂ laser fluorination and mass spectrometry on
separated olivine phenocrysts. Analyses were performed at the Universities
of Wisconsin and Oregon using standards of garnet and mantle San Carlos
olivine and similarly low δ¹⁸O olivine values were obtained in both
laboratories. The range of δ¹⁸O values (5.14‰ to 4.87‰) exceeds analytical
error and extends below the value (5.2 ± 0.2‰) inferred for pristine
mantle. A comparable δ¹⁸O range, extending down to 4.57, was also found in
a recent study of olivine phenocrysts from Sao Miguel. Similarly, MORB
glass data from 38–40° N along the mid-Atlantic ridge, across the centre
of the Azores platform, range from 5.2‰ to 5.7‰, which is equivalent to
4.7‰ to 5.2‰ in olivine. Thus, δ¹⁸O values below that of pristine mantle
do appear to occur in the Azores (Fig. 1). Such low δ¹⁸O values are
characteristic of the layer-3 gabbros and altered peridotites from the
oceanic lithosphere (δ¹⁸O = 3–5) but are unlike altered oceanic crust
(δ¹⁸O = 5–9) or pelagic sediments (δ¹⁸O = 15–25). Subcontinental
lithospheric mantle is expected to have MORB-like δ¹⁸O and so the presence
of low δ¹⁸O supports models of recycling of oceanic rather than
subcontinental lithospheric mantle.

The ¹¹B/¹⁰B isotope ratios and B concentrations were measured by thermal
ionization mass spectrometry at Pisa using the di-caesium metaborate
method following alkali carbonate fusion and ion exchange separation.
Sample S10, which had visible evidence for alteration along cracks,
yielded a δ¹¹B ratio of +5.02, providing strong evidence for seawater
(δ¹¹B = +39) contamination for this one sample. For the remaining samples,
δ¹¹B data show a large range from −3.3 to −7.6‰. B concentrations vary
from 2.4 to 6.2 p.p.m. and the samples with the lowest B concentrations
have the highest B isotope ratios. Also, with the exception of one sample
(FCP-18), B and O isotopic compositions are positively correlated, ranging
from compositions within the normal range for oceanic basalts to low δ¹¹B
and δ¹⁸O (Fig. 2a). These relationships are in contrast to the effects of
seawater contamination and thus the data are likely to reflect magmatic
signatures. The source of MORB has ⁸⁷Sr/⁸⁶Sr ~ 0.7025, an average δ¹¹B of
−4.6‰ (refs 11–13) and Nb/B ~ 3 (ref. 22), whereas the Azores samples have
higher ⁸⁷Sr/⁸⁶Sr and Nb/B = 5–17 (Fig. 2b, c). These observations, in
combination with the inverse relationship between B and δ¹¹B and the O
isotope data, strongly suggest that the basalts sample a source that has
been modified. The O and B isotope data implicate recycled material that
had undergone fractionation at relatively low temperatures in the
near-surface environment. Conversely, the absence of any strongly elevated
δ¹¹B and δ¹⁸O ratios, in conjunction with the observation that those
samples with the highest ⁸⁷Sr/⁸⁶Sr also have the lowest δ¹⁸O ratios,
indicates that the magmas did not significantly interact with the local
mid-Atlantic oceanic crust through which they ascended.
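
The Nb/B ratios quoted above follow directly from the Nb and B concentrations in Table 1. As a worked check, the Python sketch below computes the ratio for a subset of samples and compares each with the MORB source value of about 3 quoted in the text. The variable and function names are illustrative, and the concentrations are transcribed from Table 1 as it appears here.

```python
# (sample, Nb in p.p.m., B in p.p.m.) transcribed from Table 1.
samples = [
    ("S3", 77.9, 6.2),
    ("S19", 87.5, 5.9),
    ("SJ30", 49.1, 3.1),
    ("T18", 41.5, 3.4),
    ("P25", 43.1, 3.8),
    ("P29", 52.4, 3.7),
    ("FCA-6", 35.0, 2.6),
]

MORB_NB_B = 3.0  # approximate MORB source value quoted in the text (ref. 22)


def nb_b_ratio(nb_ppm, b_ppm):
    """Nb/B concentration ratio; both inputs in p.p.m."""
    if b_ppm <= 0:
        raise ValueError("B concentration must be positive")
    return nb_ppm / b_ppm


ratios = {name: nb_b_ratio(nb, b) for name, nb, b in samples}
for name, ratio in ratios.items():
    print(f"{name}: Nb/B = {ratio:.1f} ({ratio / MORB_NB_B:.1f} x MORB)")
```

Every ratio in this subset lies within the 5–17 range given for the Azores samples and is several times the MORB value.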

The nature of the recycled material is explored further on a plot of Nb/B
versus δ¹¹B (Fig. 2b). Nb and B have very similar partitioning behaviour
in mantle minerals and are unlikely to be fractionated significantly
during melting or crystallization. Instead, B is strongly fluid-mobile
compared with Nb and so Nb/B is mainly sensitive to fluid transfer and/or
mixing processes, which will be linear on Fig. 2b. MORB and their source
have a Nb/B ratio of 3–3.5 (refs 22, 25) whereas the Nb/B ratios in the
Azores and many other ocean island basalts are all significantly higher
than this (Fig. 2b), suggesting a source strongly depleted in B by fluid
removal. Fluids preferentially mobilize ¹¹B in the low-temperature
environment and so the source of the basalts must have had higher δ¹¹B
than their measured values before fluid extraction of B. This suggests
that their source originally had δ¹¹B > −2, which is similar to that of
oceanic lithosphere that has been altered by interaction with sea water.
Subsequent fluid loss, such as that attending subduction, would raise the
Nb/B ratios and lower δ¹¹B; that is, in a manner consistent with the
observed decrease in δ¹¹B downward from altered oceanic crust into the
underlying gabbros and peridotites in oceanic lithosphere or decreasing
δ¹¹B in volcanic rocks from cross-arc transects. This could produce a
component with δ¹¹B ~ −3 and Nb/B ~ 17 observed in our samples on Fig. 2b.
Other Azores basalts, especially those from Sao Miguel, which trend
towards lower δ¹¹B and Nb/B in Fig. 2b, could reflect involvement of
recycled oceanic crust.

Figure 1 | Oxygen isotope variation across the Azores platform. Plot of
δ¹⁸O (±1σ) versus latitude. Data from this study, and refs 10 and 18.
Vectors indicate the sense of displacement during hydrothermal alteration.

Figure 2 | Variation of B isotopes with other geochemical indices in the
Azores. Plots of δ¹¹B versus δ¹⁸O (a), Nb/B (b) and ⁸⁷Sr/⁸⁶Sr (c). Vectors
indicate the effects of fluid loss. Average MORB value was compiled from
refs 11–13, 22, 25 and 30 and the subducted sediment value is taken from ref.
31. Sample numbers in a refer to Table 1. Open symbols identify data from
Sao Miguel that have been argued to have increased ⁸⁷Sr/⁸⁶Sr owing to
recycling of ~3 Ga oceanic basalts.

Although the B and O isotope data unambiguously seem to implicate the
presence of recycled components in the Azores plume they do not by
themselves constrain the age of these components. However, Fig. 3 shows
that there are broad negative correlations of B concentration, Nb/B and
δ¹¹B with ¹⁸⁷Os/¹⁸⁸Os and the observation that the high-Nb/B endmember
extends to subchondritic Os isotope ratios uniquely constrains the age of
this endmember to exceed 2.5 Gyr (ref. 1). Furthermore, the Sao Miguel
rocks bearing evidence for involvement of recycled oceanic crust have high
¹⁸⁷Os/¹⁸⁸Os and ⁸⁷Sr/⁸⁶Sr ratios and recent modelling of Hf-Nd isotope
systematics suggests that this component is recycled, melt-enriched
oceanic crust which could be as ancient as 3 Gyr (ref. 2). These combined
observations provide compelling evidence for both the age and origin of
material in a mantle plume.

Figure 3 | Variation of Os isotopes with other geochemical indices in the
Azores. Plots of ¹⁸⁷Os/¹⁸⁸Os versus B concentration (a), Nb/B (b) and δ¹¹B
(c). Open symbols identify data from Sao Miguel.

Although the multi-stage, multi-component model precludes a unique
quantitative treatment, the combined data may be interpreted in the
following way. Large degree (≥20%) partial melting to produce Archaean
oceanic crust also formed a refractory residual lithospheric mantle that
was sufficiently depleted in Re to have a subchondritic Re/Os ratio.
Hydrothermal or other surficial processes resulted in lowering of δ¹⁸O and
enrichment of B and δ¹¹B but, during later subduction, fluid loss from the
lower gabbro and peridotite sections of the plate raised Nb/B ratios and
lowered δ¹¹B. Long-term storage of this material in the mantle resulted in
the development of subchondritic Os isotopes while overlying,
melt-enriched basalts developed their distinctive Hf-Nd isotope
characteristics and elevated ¹⁸⁷Os/¹⁸⁸Os. Later entrainment in a mantle
plume brought these materials back to the shallow mantle, where
decompression melting produced basaltic magmas that carry the integrated
Os-O-B signal. Thus, the Azores basalts contain evidence for contributions
from the two main components of an ancient oceanic plate and the broad
symmetry and length scale observed in the isotope data suggests that this
component is intrinsic to the plume over 10–100 km. However, numerical
fluid dynamic models suggest that subducted materials that remain in the
mantle convection system will become highly attenuated and that the
timescale for circulation through the mantle is likely to be of the order
of 500 million years rather than many billions of years. In contrast,
recent seismic results suggest that some subducted plates may pile up at
the core–mantle boundary, where they could remain stored for much longer
periods of time. Seismic tomography suggests that the Azores plume is
ascending from the core–mantle boundary and so we suggest that this plume
samples recycled material that was subducted to the core–mantle boundary
and stored there since the Archaean era.


Pollinator shifts drive increasingly long nectar spurs
in columbine flowers


Justen B. Whittall'+ & Scott A. Hodges’ 


Directional evolutionary trends have long garnered interest because they
suggest that evolution can be predictable. However, the identification of
the trends themselves and the underlying processes that may produce them
have often been controversial. In 1862, in explaining the exceptionally
long nectar spur of Angraecum sesquipedale, Darwin proposed that a
coevolutionary ‘race’ had driven the directional increase in length of a
plant’s spur and its pollinator’s tongue. Thus he predicted the existence
of an exceptionally long-tongued moth. Though the discovery of Xanthopan
morgani ssp. praedicta in 1903 with a tongue length of 22 cm validated
Darwin’s prediction, his ‘race’ model for the evolution of long-spurred
flowers remains contentious. Spurs may also evolve to exceptional lengths
by way of pollinator shifts as plants adapt to a series of unrelated
pollinators, each with a greater tongue length. Here, using a
species-level phylogeny of the columbine genus, Aquilegia, we show a
significant evolutionary trend for increasing spur length during
directional shifts to pollinators with longer tongues. In addition, we
find evidence for ‘punctuated’ change in spur length during speciation
events, suggesting that Aquilegia nectar spurs rapidly evolve to fit
adaptive peaks predefined by pollinator morphology. These findings show
that evolution may proceed in predictable pathways without reversals and
that change may be concentrated during speciation.



The contemporary evolutionary ‘fit’ of nectar spurs and pollinator tongue
lengths has been repeatedly demonstrated, suggesting that these traits
have evolved owing to their interaction. However, there remains a
controversy surrounding the mechanism by which this relationship evolves.
Under a hypothesis first proposed by Darwin and later elaborated by
Wallace, nectar spurs and pollinator tongues are engaged in a one-to-one
coevolutionary ‘race’. They suggested that, within a population, the
plants with the longest nectar spurs have a selective advantage because
their reproductive organs optimally contact pollinators and thus they
achieve the greatest reproduction, whereas pollinators with the longest
tongues have a selective advantage because they obtain the largest food
reward (Fig. 1a). Spur length and pollinator tongue length then coevolve
by following gradually shifting adaptive peaks (Fig. 1b). Alternatively,
the pollinator shift hypothesis posits that tongue lengths are relatively
fixed and spurs evolve in a one-sided process to fit them (Fig. 1c). The
tongue length of a pollinator may have evolved before an association with
a plant species owing to selection on body size or in response to the spur
lengths of other plant species. When a plant becomes newly associated with
a pollinator owing to dispersal to a new environment or changes in
pollinator abundance, spurs then evolve to fit the pollinator’s tongue
length (Fig. 1c). Shifts to pollinators with markedly shorter tongues are
predicted to be less likely because pollinators avoid flowers when they
cannot obtain a reward. Thus, this model also predicts that nectar spurs
will generally become longer through time as spurs evolve to match a
series of relatively stable adaptive peaks defined by tongue lengths
(Fig. 1d).

Figure 1 | Two contrasting hypotheses for the evolution of exceptionally
long nectar spurs. Darwin’s coevolutionary race model (a, b), which posits
a gradual increase in both the pollinator’s tongue and the plant’s nectar
spur, and the pollinator shift model (c, d), where spur length evolves
owing to a switch to a new pollinator with a longer tongue. These models
differ in whether adaptive peaks are constantly increasing (b), or whether
they are relatively fixed optima based on pollinators’ pre-existing tongue
lengths (d). They also differ in whether spur-length evolution occurs
gradually (b) or in a punctuated fashion (d).

A major difference between the two hypotheses is whether change occurs
gradually within a species’ lineage during a coevolutionary race with the
same pollinator, or rapidly during shifts to new pollinators (Fig. 1). As
shifts to new pollinators generally result in reproductive isolation, the
change in spur length would be concentrated at speciation. Therefore,
comparative phylogenetic analyses of the pattern of spur-length evolution
can test both the common prediction that spurs generally become longer
through time and also whether one of the hypotheses better explains the
overall pattern of spur-length evolution.

The columbine genus Aquilegia (Ranunculaceae) is the result of a 
recent and rapid radiation’’, thought to be due to a key evolutionary 
innovation, namely nectar spurs’*. To determine how nectar spur 
length evolved in columbines, we used a comparative phylogenetic 
analysis of all 25 North American taxa, which have nectar spurs that 
vary in length over a 16-fold range (7.5–123 mm). To provide the
phylogenetic framework for comparative tests, we used a genomic 
survey with amplified fragment length polymorphisms (AFLPs) for 
all taxa (Supplementary Table 1). Bayesian analysis’” of 1,576 variable 
markers for 176 individuals results in a highly resolved and well 
supported phylogeny for the North American Aquilegia clade (Sup-
plementary Fig. 1a). Eighty per cent of the 30 interspecific nodes
are resolved with greater than 95% posterior probabilities. The 
phylogenetic results are robust to several alternative methods of 
phylogenetic reconstruction (Supplementary Fig. 1b-d). To map 
Figure 2 | Quantification of pollination syndromes and the distribution of
spur lengths in Aquilegia. a, Principal components analysis of 10 floral traits 
clusters species according to pollination syndrome, as defined in ref. 16. 
Species with published records of pollinator visitation are indicated with 
coloured symbols (blue, bumble-bee; red, hummingbird; yellow, hawkmoth; 
orange, hummingbird and hawkmoth; see Supplementary Methods). The 
first principal components axis (PC1) and the second (PC2) are plotted. 
Error bars, ±1 s.e.m. b, The distribution of spur lengths among the North
American Aquilegia ranked by size. Taxa in each syndrome are colour-coded
as in a. Error bars, ±1 s.e.m.
changes in pollination syndrome onto the phylogeny, we quantified
multi-character pollination syndromes using principal components 
analysis (PCA) of ten floral traits, including nectar spur length. We 
found that the first two axes separate species into three distinct pol- 
lination syndromes (bumble-bee, hummingbird and hawkmoth) as 
described in ref. 16, and are consistent with direct pollinator obser- 
vations for 11 of the taxa’ (Fig. 2a; Supplementary Methods). These 
three syndromes have nearly non-overlapping distributions of spur 
lengths (Fig. 2b). 

We used the phylogeny to determine the history of pollination 
syndrome evolution. Pollination syndromes are not all monophyletic 
(Shimodaira–Hasegawa test; P < 0.00001), indicating that multiple
shifts have occurred. To identify the number of pollinator transi- 
tions, we mapped the three discrete pollination syndromes identified 
by PCA using local maximum-likelihood ancestral state reconstruc- 
tions (Fig. 3a). This analysis indicates at least seven independent shifts 
between unrelated pollinators: two transitions from bumble-bee to 
hummingbird pollination and five shifts from hummingbird to 
hawkmoth pollination (Fig. 3a). To test for directionality in pollinator 
shifts, we developed a model describing the minimum number of 
transitions in pollination syndrome across the phylogeny'*. Because 
transitions could be reversible, there are six possible transitions 
among the three syndromes. However, the maximum-likelihood 
solution resulted in only two significant transitions: bumble-bee to 
hummingbird, and hummingbird to hawkmoth (Fig. 3b; Supplemen- 
tary Table 2). Therefore, there has been significant directionality in 
pollinator shifts and a lack of reversals in columbines. 

Underlying the directionality of pollinator shifts are evolutionary 
transitions in spur length. To determine if spur length changes more 
when pollination syndromes shift, we used independent contrasts'’® 
and found that 73% of the total spur-length evolution occurs 
Figure 3 | Phylogenetic analysis of pollination syndrome evolution in
Aquilegia. a, The majority-rule consensus bayesian cladogram with tips of 
the tree representing reciprocally monophyletic populations or species (see 
Supplementary Table 1). All interspecific nodes were supported by posterior 
probabilities >0.95 except where indicated. Species were assigned to 
pollination syndrome on the basis of Fig. 2a. The probability of each 
pollination syndrome occurring at ancestral nodes is indicated with pie 
charts at the nodes. Asterisks on the phylogeny indicate inferred shifts 
between pollination syndromes. b, Representative flowers of each 
pollination syndrome, with arrows indicating the only two significant 
transitions identified by the minimal model of evolution. 
coincidently with pollinator shifts. This is significantly more than
when pollinators do not change (Mann–Whitney, z = 3.4, one-tailed
P = 0.00016). Though it is impossible to determine if spurs have
lengthened or shortened when pollination syndromes remain con- 
stant, when syndromes shift they do so in ordered transitions (Fig. 3b) 
and thus we could infer the directionality of spur-length change. 
Twelve of the thirteen informative contrasts in pollination syndrome 
resulted in increases in spur length (sign test, one-tailed P =
0.00171). The only decrease in spur length occurred for the smallest
pollination syndrome contrast (Fig. 4). Furthermore, larger pollina-
tion syndrome contrasts are correlated with greater spur-length
evolution (regression analysis, r² = 0.669, F = 56.58, d.f. = 28, P =
1.71 × 10⁻⁸; Fig. 4). These findings are robust to the alternative
pollination syndrome assignments for Aquilegia micrantha and 
Aquilegia barnebyi (Supplementary Methods) and several alternative 
branch length transformations (Supplementary Table 3). 
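
The sign test and the regression statistics quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis code: it assumes a one-tailed sign test against a null probability of 0.5, and it assumes the regression has a single predictor, so that r² = F / (F + d.f.). The function names are hypothetical.

```python
from math import comb


def sign_test_one_tailed(successes: int, trials: int) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials


def r_squared_from_f(f_stat: float, resid_df: int) -> float:
    """Recover r^2 from F for a regression with one predictor (assumed)."""
    return f_stat / (f_stat + resid_df)


# Twelve of thirteen informative contrasts were increases in spur length.
p_value = sign_test_one_tailed(12, 13)
print(f"sign test P = {p_value:.5f}")  # sign test P = 0.00171

# F = 56.58 with 28 residual degrees of freedom, as quoted in the text.
r2 = r_squared_from_f(56.58, 28)
print(f"r^2 = {r2:.3f}")  # r^2 = 0.669
```

Both values agree with those reported in the text, which supports reading the regression statistic as r².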

The pollinator shift model also predicts that changes in pollination 
syndrome, and thus spur length, will occur in a ‘punctuated’ fashion’,
primarily during speciation events where at each node, one daughter 
lineage rapidly evolves to occupy a new adaptive peak and the other 
retains the ancestral condition (Fig. 1d). We tested for the existence 
of one or more stable evolutionary states in spur length by comparing 
models that incorporate ‘adaptive peaks’” to a null model of brow- 
nian motion. We found significant preference for a model with three 
distinct optima (likelihood ratio 33.03, d.f.=6, P=1.8 x 10 °; 
Table 1), indicating that once a lineage has adapted to one pollination 
syndrome, spur length remains relatively stable until there is a trans- 
ition to another pollination syndrome. We also explicitly tested 
whether spur length evolves gradually, in proportion to branch 
length, or in a punctuated fashion at speciation® (Fig. 1d), and found 
that the punctuated model was preferred (Supplementary Discus- 
sion; Supplementary Table 4). 

Spur-length evolution within a pollination syndrome may often be 
due to Darwin’s coevolutionary race hypothesis, though the pollina- 
tor shift model may also explain some of this variation. For example, 
pollinator shifts between distantly related hawkmoths may account 
for the substantial variation in spur length among hawkmoth- 
pollinated Aquilegia species (Fig. 2b). Hawkmoth pollinators with 
relatively short tongues, for example, Hyles lineata and Eumorpha 
achemon (38–46 mm)’, belong to the Macroglossinae, whereas those
with long tongues, for example, Sphinx vashti and Manduca spp. (54–
137 mm)”, belong to the Sphinginae. These subfamilies have been
separated for at least 13.3 Myr (ref. 22), far longer than the age of the 
Figure 4 | Independent-contrasts regression analysis of pollination
syndrome and spur-length evolution. Pollination syndromes were ordered 
so that contrasts reflected the minimum model’s structured transitions 
(Fig. 3b). Transitions in pollination syndrome are significantly correlated 
with increasing spur lengths. 
Table 1 | Alternative models of spur-length evolution

                                Number of optima for adaptive models
Evaluation method   BM model    One         Three
LnL                 47.93       47.94       14.90*
AIC                 51.93       55.94       26.90†
SIC                 54.73       61.54       35.31†

Spur-length evolution was modelled using a non-adaptive, brownian motion (BM) model, as 
well as adaptive models constrained around one or three adaptive optima°. Alternative models 
were evaluated by log likelihood (LnL), Akaike information criterion (AIC) and Schwarz 
information criterion (SIC). As statistical criteria for model selection, AIC uses the LnL and the 
number of parameters whereas the SIC additionally incorporates the number of observations. In 
both methods, lower values are preferred. 

*P=18x10 °. 

† Strongly preferred²⁰.


entire North American Aquilegia clade’’. Thus, tongue-length differ- 
ences among hawkmoths were probably established before their asso- 
ciation with Aquilegia, perhaps owing to coevolutionary races with 
other plant species. Therefore, transitions within pollination syn- 
dromes may also be consistent with the one-sided pollinator shift 
model. 

Our finding that spur length in Aquilegia evolves largely through 
directional transitions among pollination syndromes suggests that 
reaching some adaptive optima may require intermediate ‘stepping 
stones’. Thus, the lack of an intermediate adaptive peak may prevent 
a species from shifting to an even more extreme morphology. For 
example, hummingbirds are absent from Eurasia, but hawkmoths do 
exist there”’. Significantly, there is very little variation in spur length 
among Eurasian Aquilegia species (4.0–21.5 mm; mean ± s.e.m.,
13.0 ± 0.59 mm) and no clear examples of the hawkmoth pollination
syndrome. The length of the longest spurred Eurasian Aquilegia spe-
cies (Aquilegia alpina, 21.5 mm) is more than 10 mm shorter than the
shortest North American hawkmoth-pollinated species (Aquilegia
pubescens, 32.4 mm). In the absence of hummingbirds or another
appropriate pollinator with a comparable tongue length, Eurasian 
Aquilegia species may lack an adaptive ‘stepping stone’ necessary to 
reach the hawkmoth adaptive optimum. 

Although Darwin’s coevolutionary race may be responsible for 
spur-length evolution within species, our comparative phylogenetic 
evidence indicates that the majority of spur-length evolution in 
columbines fits the pollinator shift model. Because columbines have 
experienced a recent and rapid adaptive radiation”, it is likely that 
pollinator tongue lengths were predominantly established before 
spur-length evolution, which has thus evolved primarily during 
repeated and directional shifts among pollination syndromes. Of 
particular note is the finding that shifts in pollination syndrome have 
occurred without reversals, resulting in the progressive lengthening 
of nectar spurs. Our results also indicate that large changes in spur 
length occur disproportionately at speciation events, resulting in 
‘punctuated’ morphological changes. 


METHODS SUMMARY 


We sampled 155 individuals of all 25 North American Aquilegia taxa, and 21 
individuals of 8 outgroup taxa, and genotyped them for AFLP markers. 
We conducted a bayesian phylogenetic analysis of the binary data under a 
restriction-site model in MRBAYES’ and compared the resulting phylogeny 
with parsimony and distance-based phylogenetic analyses (Supplementary 
Methods). The majority rule consensus tree from the bayesian analysis was 
pruned to populations or species for subsequent comparative analyses. 

For 10 floral traits, we measured 10-52 individuals representing 1 to 5 
populations per species for 23 of the 25 taxa in an among-species principal 
components analysis (see Supplementary Table 5 for axis loadings). Trait 
data for the remaining two species were estimated from the literature 
(Supplementary Methods). We conducted local maximum-likelihood ancestral 
state reconstructions, and tested for directionality of pollination syndrome 
evolution using Bayes Multistate'*. Adaptive models of spur-length evolution 
were compared using the likelihood ratio test, the Akaike information criterion 
(AIC) and the Schwarz information criterion (SIC) in OUCH!”’, and the 
punctuated-speciational versus gradual models were compared using AIC values 
in CoMET®. 
Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature. 
METHODS
Taxonomic sampling. One to three populations (2-5 individuals per popu- 
lation) of all 25 North American Aquilegia taxa were sampled (Supplemen- 
tary Table 1). Outgroups consisted of seven Eurasian Aquilegia species and 
Semiaquilegia adoxoides. 
AFLP methodology. DNA was extracted from fresh leaf material of wild col- 
lected plants or plants grown from wild collected seeds. The AFLP protocol 
followed a modified version of that in ref. 24, with a 5× digestion with EcoRI
and MseI and two fluorescently labelled E-primers in each selective amplifica-
tion. Products from 20 selective amplifications were separated on a LiCor 4200
DNA sequencer and variable bands were manually scored with SAGA MX 
(LiCor). 
Phylogenetic analyses. A bayesian analysis of the binary data incorpora- 
ting differential rates of band gains and losses was conducted with the 
RESTRICTION model in MRBAYES v3.0 beta 4’°. As only variable AFLP bands 
were scored, CODING was set to VARIABLE. Three runs of five million genera- 
tions were sampled every 100 generations then combined after removing the 
burn-in (1.5 million generations per run). Posterior probabilities and branch 
lengths were calculated from the consensus of the posterior distribution of trees. 
To determine the robustness of the phylogenetic estimate, we conducted a series 
of additional phylogenetic analyses under distance and parsimony optimality 
criteria (see Supplementary Methods). For comparative analyses, the bayesian 
phylogeny was pruned to populations or species using the average branch-length 
to the clade’s descendants and outgroups were removed. The rooted phylogeny 
was then converted to an ultrametric tree using the NPRS algorithm in r8s (ref. 
25) with a smoothing factor of —0.00001. The single unresolved node in the 
bayesian consensus phylogeny (PP <0.5) was grafted in both alternative 
arrangements with a minimum branch length (1 X 10°) for comparative ana- 
lyses that require a strictly bifurcating phylogeny”®. 
Quantification of floral traits. Ten floral traits were quantified for all species 
(see Supplementary Methods). PCA among species was then used to validate 
previously assigned pollination syndromes'*”’ and to help assign syndromes to 
the few remaining species (see Supplementary Methods). Trait loadings on the 
first two principal components axes are presented in Supplementary Table 5. 
Aquilegia micrantha and A. barnebyi were considered polymorphic for hum- 
mingbird and hawkmoth pollination syndrome or, when polymorphic coding 
was not an option, separate comparative analyses were conducted with these 
species coded as alternative character states. 
Spur lengths. Spur lengths were log-transformed (Kolmogorov–Smirnov test
for normality, P > 0.15) for comparative analyses.
Pollination syndrome evolution. Local maximum-likelihood ancestral state 
reconstructions for the three discrete pollination syndromes were conducted 
in Bayes Multistate’*. This program was then used to estimate a minimum model 
of pollination syndrome evolution (Supplementary Table 2). We first tested for 
irreversible transitions by restricting the smallest rates to equal zero. Likelihood 
ratios >2 were used to determine if rates were significantly different from zero. 
To determine if any of the remaining transition rates were significantly different 
from one another, we compared a two-rate model with a one-rate model. We 
used a likelihood ratio test to determine statistical significance following a χ²
distribution with one degree of freedom based on the difference in the number of
free parameters between the two models. 
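
As a minimal sketch of the likelihood ratio test described above, the following compares a one-rate model nested within a two-rate model using a χ² distribution with one degree of freedom. It does not reproduce the Bayes Multistate analysis: the log-likelihood values are hypothetical placeholders, and the function names are introduced here for illustration only.

```python
from math import erfc, sqrt


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-squared variate with 1 d.f.

    For 1 d.f., P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(x / 2.0))


def likelihood_ratio_test(lnl_full: float, lnl_reduced: float):
    """Return (statistic, P) for two nested models differing by one parameter."""
    statistic = 2.0 * (lnl_full - lnl_reduced)
    # A negative statistic can arise from optimizer noise; clamp it to zero.
    return statistic, chi2_sf_1df(max(statistic, 0.0))


# Hypothetical log-likelihoods: two-rate (full) versus one-rate (reduced).
stat, p = likelihood_ratio_test(lnl_full=-20.0, lnl_reduced=-23.0)
print(f"LR statistic = {stat:.2f}, P = {p:.4f}")
```

With these placeholder values the statistic is 6.0, which exceeds the 5% critical value of about 3.84 for one degree of freedom, so the two-rate model would be preferred.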
Calculation of independent contrasts. We tested for correlated evolution of 
pollination syndrome and spur length using independent contrasts’? as imple- 
mented in the PDAP module” of Mesquite”’. Pollination syndromes were coded 
as discrete states as assigned (Fig. 2a) and ordered on the basis of the minimum 
model of pollination syndrome evolution (Fig. 3b). Combining continuous and 
discrete characters does not violate the statistical assumptions of independent 
contrasts”. Independent contrasts analysis can be sensitive to branch-length 
assignments, so we used a branch-length diagnostic” and several alternative 
branch-length transformations (Supplementary Table 3). All contrasts, includ- 
ing non-informative pollinator contrasts, were included in the regression 
analysis providing a conservative estimate of correlated evolution between pol- 
lination syndrome and nectar spur length”’. 
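
The independent-contrasts calculation can be sketched as follows. This is a generic implementation of the standard algorithm on a strictly bifurcating tree with positive branch lengths, not the PDAP module of Mesquite used in the study. The nested-tuple tree encoding, the trait values and the branch lengths are all hypothetical.

```python
from math import sqrt


def independent_contrasts(node):
    """Return (node_value, adjusted_branch_length, contrasts).

    A tip is (trait_value, branch_length).
    An internal node is (left_child, right_child, branch_length).
    Branch lengths below each internal node must be positive.
    """
    if len(node) == 2:
        value, length = node
        return value, length, []
    left, right, length = node
    x1, v1, c1 = independent_contrasts(left)
    x2, v2, c2 = independent_contrasts(right)
    # Standardized contrast: difference scaled by its expected s.d.
    contrast = (x1 - x2) / sqrt(v1 + v2)
    # Ancestral estimate: average weighted by inverse branch length.
    value = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    # Lengthen the branch above this node to reflect estimation error.
    adjusted = length + v1 * v2 / (v1 + v2)
    return value, adjusted, c1 + c2 + [contrast]


# Hypothetical three-taxon tree ((A:1, B:1):1, C:2) with log spur lengths.
tree = (((1.0, 1.0), (3.0, 1.0), 1.0), (5.0, 2.0), 0.0)
root_value, _, contrasts = independent_contrasts(tree)
print([round(c, 4) for c in contrasts])  # [-1.4142, -1.6036]
```

A tree with n tips yields n − 1 contrasts, which can then be regressed through the origin against the corresponding contrasts in a second character.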
Tempo and mode of spur-length evolution. We tested a series of alternative 
evolutionary models to determine the tempo and mode of spur-length evolu- 
tion. First, we examined the fit of several adaptive models in OUCH!” to com- 
pare the maximum likelihood of the brownian motion model with successively 
more complex adaptive models that incorporate an Ornstein–Uhlenbeck elasti-
city parameter modelling adaptive optima. We compared adaptive models with
one versus three optima as described in ref. 20. The fit of the data to the models 
was estimated using a likelihood ratio test, the Akaike information criterion and 
the Schwarz information criterion. 
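
The information criteria in Table 1 can be reproduced from the likelihood row under a few assumptions that are inferred here from the table's arithmetic rather than stated in the text: the LnL row is read as −2 ln L, the brownian motion, one-optimum and three-optima models are taken to have 2, 4 and 6 free parameters respectively, and the number of observations is taken to be 30. The following sketch is a consistency check under those assumptions, not the OUCH! implementation.

```python
from math import log


def aic(neg2_lnl: float, n_params: int) -> float:
    """Akaike information criterion: -2 ln L + 2k."""
    return neg2_lnl + 2 * n_params


def sic(neg2_lnl: float, n_params: int, n_obs: int) -> float:
    """Schwarz information criterion: -2 ln L + k ln(n)."""
    return neg2_lnl + n_params * log(n_obs)


# (-2 ln L, assumed number of free parameters) for each model in Table 1.
MODELS = {
    "BM": (47.93, 2),
    "one optimum": (47.94, 4),
    "three optima": (14.90, 6),
}
N_OBS = 30  # assumed; inferred from the SIC values, not given in the text

results = {
    name: (round(aic(dev, k), 2), round(sic(dev, k, N_OBS), 2))
    for name, (dev, k) in MODELS.items()
}
for name, (aic_value, sic_value) in results.items():
    print(f"{name}: AIC = {aic_value:.2f}, SIC = {sic_value:.2f}")
```

Under these assumptions the computed AIC and SIC values match the table to two decimal places, and the three-optima model has the lowest value on both criteria.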

We examined the importance of topology and alternative branch-length scal-
ing using CoMET® implemented in Mesquite**. We compared results from a
gradual model of evolution (termed pure-phylogenetic distance®) to a punctu-
ated, speciational model (termed punctuated average equal*). Alternative mod-
els were evaluated using the Akaike information criterion. The maximum
likelihood solution assigns which descendent lineage evolves and which remains 
constant. 
West Nile virus emergence and large-scale declines of
North American bird populations 


Shannon L. LaDeau¹, A. Marm Kilpatrick² & Peter P. Marra¹


Emerging infectious diseases present a formidable challenge to the 
conservation of native species in the twenty-first century’. Diseases 
caused by introduced pathogens have had large impacts on species 
abundances’, including the American chestnut*, Hawaiian bird 
species* and many amphibians’. Changes in host population sizes 
can lead to marked shifts in community composition and ecosys- 
tem functioning***®. However, identifying the impacts of an intro- 
duced disease and distinguishing it from other forces that influence 
population dynamics (for example, climate’) is challenging and 
requires abundance data that extend before and after the intro- 
duction”’. Here we use 26 yr of Breeding Bird Survey (BBS)* data 
to determine the impact of West Nile virus (WNV) on 20 potential 
avian hosts across North America. We demonstrate significant 
changes in population trajectories for seven species from four fam- 
ilies that concur with a priori predictions and the spatio-temporal 
intensity of pathogen transmission. The American crow population 
declined by up to 45% since WNV arrival, and only two of the seven 
species with documented impact recovered to pre-WNV levels 
by 2005. Our findings demonstrate the potential impacts of an 
invasive species on a diverse faunal assemblage across broad geo- 
graphical scales, and underscore the complexity of subsequent 
community response. 

Seven years after the emergence of WNV in New York City in 1999, 
the population-level impacts of this disease on wild birds remain
largely unknown*"’. Tens of thousands of dead individuals from
wild, zoo and pet populations have tested positive for WNV across 
North America’, and challenge experiments have demonstrated 
interspecific variability in mortality rates under laboratory condi- 
tions'’. Early field studies documented mortality in some species'*"* 
and evidence of spatially heterogeneous fluctuations”'*'®, but overall 
population patterns were inconclusive. Our study tests the hypo- 
thesis that WNV has caused significant population declines in a broad 
taxonomic range of avian hosts across North America. We explicitly 
considered variability in host susceptibility, spatio-temporal hetero- 
geneity in pathogen transmission, and impacts on populations. 

To test this hypothesis, we developed a set of independent predic- 
tions of WNV impact for 20 species of birds from 11 families on the 
basis of published laboratory infection experiments, mosquito feeding 
studies and seroprevalence surveys (Table 1 and Supplementary Table 
1). Target species span a range of expected impacts, from crows with 
high mortality to gray catbirds and mourning doves, which seem to 
tolerate infection without significant morbidity'®'’. Additionally, we 
chose five species (Baltimore oriole, chipping sparrow, eastern bluebird, 
eastern towhee and white-breasted nuthatch) that have not been the 
focus of previous work to assess potential disease impacts on a broader 
community. We then used a bayesian hierarchical regression fit to 26 yr 
of survey data to test these species-specific predictions across the large 
geographical scale represented by WNV emergence in North America. 


Table 1 | Predicted and observed impact of WNV, climate influence and 10- and 26-yr minimum abundances

Species                                       Predicted   Observed   Change in DIC   Minimum abundance
                                              impact      impact     with climate    10-yr     26-yr

American crow (Corvus brachyrhynchos)         High        Yes        4.6             2004*     2004*
Blue jay (Cyanocitta cristata)                High        Yes        −14.5‡          2004*     2004*
Fish crow (Corvus ossifragus)                 High        No         0.0             2005*     2005*
Tufted titmouse (Baeolophus bicolor)          High        Yes        =)              2004*     1980
American robin (Turdus migratorius)           Moderate    Yes        5.7             2005*     1981
House wren (Troglodytes aedon)                Moderate    Yes        0.              2003*     2003*
Chickadee† (Poecile spp.)                     Moderate    Yes        −23.8‡          1996      1985
Common grackle (Quiscalus quiscula)           Moderate    No         0               2003*     2003*
Northern cardinal (Cardinalis cardinalis)     Moderate    No         2.              1997      1980
Song sparrow (Melospiza melodia)              Moderate    No         =5A             2004*     2004*
Downy woodpecker (Picoides pubescens)         Low         No         =10             2003*     1984
Gray catbird (Dumetella carolinensis)         Low         No         −1.2‡           1997      1986
Mourning dove (Zenaida macroura)              Low         No         −1.8‡           1997      1980
Northern mockingbird (Mimus polyglottos)      Low         No         0.4             1997      1997
Wood thrush (Hylocichla mustelina)            Low         No         0.              2003*     2003*
Eastern bluebird (Sialia sialis)              Unknown     Yes        0.4             2003*     1980
Baltimore oriole (Icterus galbula)            Unknown     No         44              2004*     2004*
Chipping sparrow (Spizella passerina)         Unknown     No         43              1997      1980
Eastern towhee (Pipilo erythrophthalmus)      Unknown     No         −1.2‡           2004*     1991
White-breasted nuthatch (Sitta carolinensis)  Unknown     No         0.2             1996      1984

See Supplementary Table 1 for full details of predicted impact data. Impact of WNV is measured as abundance below 95% CIs after WNV arrival. Data are for 20 North American bird species.
* The indicated years follow peak human WNV epidemics in the United States (2002–03).
† Black-capped (Poecile atricapilla) and Carolina chickadee (Poecile carolinensis) numbers were combined, as species-specific data are unreliable in areas of range overlap.
‡ Negative values for deviance information criterion (DIC) indicate improved model fit with inclusion of climate for these species.


¹Smithsonian Migratory Bird Center, National Zoological Park, Washington DC 20008, USA. ²Consortium for Conservation Medicine, New York, New York 10001, USA.
Thirteen of the twenty species studied reached 10-yr population
lows after the large-scale human WNV epidemics that occurred in 
much of the United States in 2002-03 (ref. 11) (P = 0.002, assuming 
probability of 10-yr low after 2002 = 0.30), and eight recorded their 
lowest abundance over the 26 yr studied (P = 0.001, assuming prob- 
ability of 26-yr low after 2002 = 0.12) (Table 1 and Fig. 1). However, 
to determine whether WNV was involved in these declines, changes 
in abundance must be evaluated in the context of long-term trends, 
climate and habitat availability. We included climate variability (El
Niño/Southern Oscillation) in final population models for eight spe-
cies for which model fit was significantly improved (Table 1 and
Supplementary Information). We did not include a land-use com- 
ponent in the population model and thus, are unable to rule out a 
potentially confounding role of changes in land cover during this 
study (but see below). 

Observed abundances after WNV emergence were significantly 
lower than expected given two decades of population variability for 
seven species across multiple geographical regions (Figs 2 and 3). Six 
of these species were independently predicted to suffer high or mod- 
erate impacts, and the seventh was previously unstudied (Table 1). 
These seven species included two members of the family Corvidae 
(American crow and blue jay), two from Turdidae (American robin 
and eastern bluebird), two from Paridae (chickadees and tufted 
titmouse) and one from Troglodytidae (house wren). Population 
deviations (average difference between modelled and observed abun- 
dances) were highly correlated with categorical predicted impacts for 
the 15 species with prior information (Supplementary Information; 
r = −0.67, n = 15, P = 0.007). All seven of these species are peri-
domestic, with known suburban association®*!*. Thus, the declines
observed for these species are opposite from expectations given con- 
tinued suburbanization after 1999", but are consistent with impacts 
owing to WNV. 


[Figure panels: American crow, American robin, Baltimore oriole]
Observed impacts included steep and sometimes progressive multi-
year declines in regional populations of American crows (Fig. 2),
American robins, chickadees and eastern bluebirds, which were all 
increasing before WNV arrival (Fig. 1). Other species, including blue 
jays, tufted titmice and house wrens showed strong 1- or 2-yr declines 
after intense WNV epidemics, but little or no impacts at other times. 
Regionally, we found significant deviations from expected abun- 
dances for all seven species in the eastern United States (Figs 2 and 
3). In other areas where WNV has been present for fewer years, the 
intensity of impact varied among species (Supplementary Fig. 1). 
Common grackle populations in Maryland declined significantly after 
WNV emergence in that state, although in other regions this species 
remained at expected abundances (Supplementary Fig. 2). 

The intensity of declines after pathogen emergence was most 
marked in American crows (Fig. 2). By 2005, crow abundances had 
declined regionally by up to 45% from 1998 levels, although they had 
increased steadily for two decades. American crow declines were posi- 
tively correlated with the intensity of human WNV epidemics within 
each region (r = −0.56, n = 21, P = 0.0003), despite variability in
human behaviour and feeding of mosquitoes on humans compared 
with birds''’*°. Similar correlations between human infections and 
impacts on other avian species were strongest for house wrens and 
eastern bluebirds (P< 0.002), marginally significant for tufted tit- 
mice, American robins and chickadees (0.05 < P< 0.10), and non- 
significant for blue jays (P = 0.34; Supplementary Table 2) and the 13 
species without detectable WNV impact (P-values > 0.1). 

Similarly, the intensity of WNV impacts on these six affected spe- 
cies was not always consistent across species or regions (Fig. 3). 
American robin, eastern bluebird and tufted titmouse populations 
remained below expected abundance across their entire ranges in 
2005. Deviations in chickadee populations were significantly reduced 
in the east but not at their western range limits. Neither house 
Figure 2 | Declining American crow
populations. Observed abundances
(circles) versus mean posterior
estimates (solid line, with 95% CIs)
by region across North America are
shown. Values on the y axis are the
average number of birds observed
per BBS route adjusted for observer
differences and missing data.
Shaded histograms show numbers
of reported annual human
infections'’ per region (maximum
cases per year in northeast, 370;
wren nor blue jay populations showed significant declines in
Virginia, whereas abundances were up to 22% and 26% below that 
expected in other regions before recovering in 2005 (Supplementary 
Information). 
Figure 3 | Population declines and WNV epidemics in the northeastern
United States and Maryland. Observed abundances (circles; birds observed 
per BBS route) versus mean posterior estimated population abundances 
(solid line, with 95% CIs) for six impacted species are shown. The vertical 
dotted lines denote the initial detection of WNV in birds, mosquitoes or 
humans". The complete version of this figure presents observed and 
expected abundances for each species in each geographical region where it is 
present (Supplementary Fig. 1). 
Assessing the impacts of an invasive pathogen on host popula-
tions across a continent requires difficult assumptions regarding
exposure rates, and analyses that are correlational in nature. We 
approached these challenges by using two decades of local population 
surveys and climate data to predict species abundance distributions 
in all years after WNV was first identified. We further strengthened 
our conclusions by comparing our results to the species-specific 
impacts predicted from a collection of previous studies (Table 1 
and Supplementary Table 1) and to the spatial and temporal pattern 
in human epidemics (Fig. 2 and Supplementary Table 2). We 
detected significant declines for six species predicted to have high 
or moderate WNV impacts, and did not detect declines in the five 
species with predicted low impact. Additionally, we identified sud- 
den and significant declines in eastern bluebird populations, high- 
lighting the possibility that species that have not been studied with 
respect to WNV may also be affected by this disease. The fact that we 
did not detect declines in the eight other species, which appeared to 
persist at the same abundance or even show increased abundance in 
the presence of WNV (Baltimore oriole, chipping sparrow, eastern 
towhee, northern cardinal, white-breasted nuthatch), suggests that 
the impacts of WNV were relatively low or that detection of popu- 
lation declines may have been masked by regional variability in 
population fluctuations or long term declines (common grackle, 
fish crow, song sparrow). 

After significantly low abundances, both blue jays and house wrens 
returned to expected population levels in 2005. The resiliency of 
species and the lasting impact of WNV will ultimately depend on 
the species-specific interactions between susceptibility, exposure and 
intrinsic population growth rates. The rank of observed impacts in 
corvids is consistent with susceptibility to experimental WNV infec- 
tion studies, which suggest that American crows suffer the greatest 
mortality (100%), followed by blue jays (75%) and then fish crows 
(53%)"’. Such interspecific differences in pathogen effects on popu- 
lations have been observed in other disease systems”! and can result 
in important changes in community composition. 

The spatial heterogeneity in disease impact apparent for some spe- 
cies may reflect underlying regional differences in the intensity of viral 
transmission. Several key factors in WNV transmission are known to 
vary across the continental United States, including the dominant 
enzootic vectors, the relationships between vector abundance and 
land use, and differences in the composition of host communities 
that can, in turn, influence mosquito feeding preferences. The role of 
these and other factors in determining WNV transmission and expo- 
sure among hosts is an important topic for future research. 

Changes in population abundance such as those documented in 
this study may themselves alter WNV transmission dynamics. 
Mortality is likely to facilitate WNV amplification because the 
infectiousness of hosts (magnitude of viraemia and length of viral 
shedding) is greater in individuals that die relative to those that 
survive, and hosts that die from infections are not present as 
immune or dead-end hosts. Mortality also increases the 
vector-to-host ratio, which increases the reproductive ratio of the pathogen, 
R0. Decreases in host abundance may have other impacts on WNV 
transmission. For example, decreases in the abundance of American 
robins, which appear to be an important WNV amplification host in 
several regions of the USA, have been linked to higher incidences 
of mosquitoes feeding on humans and intensified human WNV 
epidemics. 

The impacts of invasive pathogens compound existing stressors 
and create formidable challenges for protecting native wildlife. 
West Nile virus will continue to affect avian communities in the 
foreseeable future, and substantial ecosystem effects may become 
evident with time. Finally, we believe that the findings presented here 
are probably conservative estimates of population-level impacts 
because the hardest hit avian sub-populations may reside outside 
the BBS survey areas, which are limited to secondary roads and gen- 
erally exclude urban centres where the predominant Culex mosquito 
vectors in the eastern United States are most common. Nonetheless, 
the population changes that we have documented have already led to 
marked changes in the composition of avian communities across 
North America. 


METHODS SUMMARY 


We selected 20 common North American bird species that were regularly present 
along survey routes in the northeastern United States where WNV first emerged. 
We further chose species with available background information regarding sus- 
ceptibility to WNV infection (Supplementary Information), and selected the 
species pool to cover the range of expected mortality. Finally, we randomly 
selected five species from the group that satisfied our general survey criteria 
but had not been previously studied with regard to WNV. This produced a total 
of 20 species, which was chosen as the number that we could efficiently model 
and evaluate. 

We used 26 yr of North American BBS data (1980-2005) for each of the 
species selected. We included data from an average of 38 routes (range 15-88) 
and 1,900 distinct census points per region (selected to represent WNV dispersal 
westward from its east coast introduction) so that the patterns we detected would 
represent regional population changes, rather than local-scale stochasticity. 
We used a hierarchical Bayesian regression model fit to data collected before the 
emergence of WNV to estimate probability distributions for expected abundance 
in all subsequent years to 2005. Posterior distributions for expected abundance 
explicitly incorporated trends before WNV emergence and regional climate 
variability (Supplementary Information), as well as stochasticity associated with 
location and observation error. We considered a species to have been signifi- 
cantly affected when observed abundances fell outside 95% credible intervals 
(95% CIs) from the posterior abundance distributions. We validated our results 
by comparing the species-specific deviations from expected population abun- 
dances with predicted WNV impacts based on a collection of previous exposure 
and mortality studies. Finally, we evaluated the agreement between the timing of 
our estimated WNV impacts and known WNV presence in a region (first 
documented occurrence and human epidemics). 
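
The significance rule described above (an observed abundance falling outside the 95% credible interval of the posterior abundance distribution) can be illustrated with a minimal sketch. It assumes that posterior draws of expected abundance are already available as a plain list of numbers and uses simple equal-tailed nearest-rank quantiles; the function names `credible_interval` and `is_outside_interval` are hypothetical and are not taken from the original analysis.

```python
def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from posterior draws.

    Uses nearest-rank quantiles on the sorted draws; this is an
    illustrative choice, not the procedure used in the study.
    """
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (n - 1))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


def is_outside_interval(observed, samples, level=0.95):
    """True when the observed abundance lies outside the credible interval."""
    lower, upper = credible_interval(samples, level)
    return observed < lower or observed > upper
```

In this sketch a year would be flagged as significantly affected when `is_outside_interval` returns `True` for that year's observed count against the draws of expected abundance for the same year.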


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 
Data selection. We used data from 228 North American Breeding Bird Survey 
(BBS) routes across ten states (Massachusetts, Connecticut, New Jersey, New 
York, Pennsylvania, Maryland, Virginia, Illinois, Colorado, Oregon) that rep- 
resent WNV dispersal westward from the 1999 east coast introduction. We 
included routes that had at least 80% coverage from 1980 to 2005, and a max- 
imum of two missing observations in years after WNV emergence. We selected 
20 widespread and common North American bird species according to six 
criteria: (1) native to North America; (2) distribution includes New England/ 
Mid-Atlantic states where WNV has been present the longest; (3) regularly 
detected on BBS routes and in sufficient numbers along secondary roads, given 
habitat expectation and data; (4) breeding and vocalization seasons overlap with 
timing of BBS collection; (5) some background information on 
susceptibility to WNV infection available (Supplementary Table 1); and (6) 
species pool covers the range of expected WNV impact. Finally, we 
randomly selected five additional species from the group that satisfied criteria 
1-4 but that had not been studied in WNV infection experiments. This produced 
a total of 20 species, which was chosen as the number that we could efficiently 
model and evaluate. 
Model. Our modelling goal was to propagate stochasticity and trend associated 
with two decades of avian population data before WNV arrival to construct 
probability distributions for expected abundance in years after WNV emergence. 

We fit an overdispersed Poisson regression model for counts at each route- 
by-year node where stochastic relationships among routes and observers were 
normally distributed random variables with mean zero and unknown variance. 
For a given species, individual counts, $C_{j,t}$, were conditionally Poisson 

$$C_{j,t} \sim \mathrm{Pois}(\lambda_{j,t}), \quad j = 1, \ldots, m; \; t = t_1, \ldots, T \qquad (1)$$

where subscripts j and t refer to route and year, respectively. 

The expected value $\lambda_{j,t}$ for a given annual count was 

$$\log(\lambda_{j,t}) = \beta_k (t - t^*) + \kappa_{k,t} + \omega_j + \Omega_i + \varepsilon_{j,t} \qquad (2)$$

where $\beta_k$ is the linear trend centred at mean year $t^*$ over all routes in region k, $\omega$ 
and $\Omega$ are random effects for variation among routes and observers respectively, 
and $\varepsilon_{j,t}$ are normally distributed error terms with mean zero. We evaluated each 
species’ model individually for inclusion of region-specific inter-annual climate 
covariates $\kappa$ (El Niño/Southern Oscillation (ENSO) index and North Atlantic 
Oscillation (NAO) evaluated in Supplementary Information) for reduced devi- 
ance information criterion (DIC) values over a model with no inter-annual 
climate covariate. Model predictive ability was improved by including an annual 
ENSO effect for blue jay, chickadee, downy woodpecker, eastern towhee, gray 
catbird, mourning dove, song sparrow and tufted titmouse populations. 
Delineation of geographical regions (k) was also identified by DIC model com- 
parison and fell generally along separation by state (except in the northeast where 
Pennsylvania, New York, New Jersey, Massachusetts and Connecticut were 
aggregated). We used standard vague priors on all unknown parameters. 
Hyperprior distributions for precision parameters were given inverse gamma 
distributions with mean 1 and variance 1,000. Models were fit by Gibbs sampler 
using the WinBUGS program. All models were run for 50,000 iterations fol- 
lowing a 10,000 to 15,000 iteration ‘burn-in’. Convergence was assessed through 
visual inspection and Gelman-Rubin diagnostics on multiple Markov chains. 
Of the total 5,928 route-by-year nodes, 576 were missing in the raw data. These 
missing data were scattered throughout the 228 routes but occurred dispropor- 
tionately in the early part of the time series (most often between 1980 and 1987). 
We fit the model to BBS data and used the Gibbs sampler to adjust for differences 
among observers (skill and effort levels) and to estimate probability distributions 
for abundance in years and routes with missing observations. We fit the model to 
the 26-yr data set to create a replicate and complete time series for each route. We 
used a cross-validation procedure that randomly removed 30% of the data and 
evaluated the model’s ability to estimate these missing nodes. Correlation 
coefficients comparing observed and estimated nodes (given model presented) 
ranged between 0.68 and 0.98 among fits for individual species. 
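
The hold-out check described above (hide 30% of the route-by-year nodes, re-estimate them, and correlate observed with estimated values) can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the hierarchical model is replaced by a deliberately simple stand-in estimator (the mean of the retained counts on the same route), and all function names are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def route_mean_estimator(train, node):
    """Hypothetical stand-in for the fitted model: mean of the retained
    counts on the same route (falls back to the overall mean)."""
    route = node[0]
    same_route = [c for (r, _), c in train.items() if r == route]
    pool = same_route if same_route else list(train.values())
    return sum(pool) / len(pool)


def holdout_correlation(counts, estimator, frac=0.30, seed=0):
    """Hide a random fraction of (route, year) nodes, re-estimate them from
    the remaining nodes, and correlate observed with estimated values."""
    rng = random.Random(seed)
    nodes = sorted(counts)
    held_out = rng.sample(nodes, int(frac * len(nodes)))
    hidden = set(held_out)
    train = {k: v for k, v in counts.items() if k not in hidden}
    observed = [counts[k] for k in held_out]
    estimated = [estimator(train, k) for k in held_out]
    return pearson(observed, estimated)
```

Here `counts` is assumed to be a dictionary keyed by `(route, year)`. Any estimator with the same signature could be passed in place of `route_mean_estimator`.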
Predicted abundance estimates. We fit the model described above to data 
collected before WNV emergence only, and then estimated expected abundances 
at each route location in all subsequent years (that is, from 1999 for east coast, 
2001 for Illinois, 2002 for Colorado and 2004 for Oregon). We directly com- 
pared the posterior probability distributions of expected abundances in post- 
WNV years with the true observations. We assigned significance to years where 
observed abundances fell outside 95% posterior credible intervals (95% CIs). If 
WNV emergence did not cause detectable mortality rates then we would expect 
no significant deviation between observed abundance and posterior predicted 
The medaka draft genome and insights into 
vertebrate genome evolution 


Masahiro Kasahara*, Kiyoshi Naruse*, Shin Sasaki*, Yoichiro Nakatani*, Wei Qu, Budrul Ahsan, 
Tomoyuki Yamada, Yukinobu Nagayasu, Koichiro Doi, Yasuhiro Kasai, Tomoko Jindo, Daisuke Kobayashi, 
Atsuko Shimada, Atsushi Toyoda, Yoko Kuroki, Asao Fujiyama, Takashi Sasaki, Atsushi Shimizu, 
Shuichi Asakawa, Nobuyoshi Shimizu, Shin-ichi Hashimoto, Jun Yang, Yongjun Lee, Kouji Matsushima, 
Sumio Sugano, Mitsuru Sakaizumi, Takanori Narita, Kazuko Ohishi, Shinobu Haga, Fumiko Ohta, 
Hisayo Nomoto, Keiko Nogata, Tomomi Morishita, Tomoko Endo, Tadasu Shin-I, Hiroyuki Takeda, 
Shinichi Morishita & Yuji Kohara 


Teleosts comprise more than half of all vertebrate species and have 
adapted to a variety of marine and freshwater habitats. Their gen- 
ome evolution and diversification are important subjects for the 
understanding of vertebrate evolution. Although draft genome 
sequences of two pufferfishes have been published, analysis of 
more fish genomes is desirable. Here we report a high-quality draft 
genome sequence of a small egg-laying freshwater teleost, medaka 
(Oryzias latipes). Medaka is native to East Asia and an excellent 
model system for a wide range of biology, including ecotoxicology, 
carcinogenesis, sex determination and developmental genetics. 
In the assembled medaka genome (700 megabases), which is less 
than half of the zebrafish genome, we predicted 20,141 genes, 
including ~2,900 new genes, using 5’-end serial analysis of gene 
expression tag information. We found single nucleotide poly- 
morphisms (SNPs) at an average rate of 3.42% between the two 
inbred strains derived from two regional populations; this is the 
highest SNP rate seen in any vertebrate species. Analyses based on 
the dense SNP information show a strict genetic separation of 4 mil- 
lion years (Myr) between the two populations, and suggest that 
differential selective pressures acted on specific gene categories. 
Four-way comparisons with the human, pufferfish (Tetraodon), 
zebrafish and medaka genomes revealed that eight major interchro- 
mosomal rearrangements took place in a remarkably short period 
of ~50 Myr after the whole-genome duplication event in the teleost 
ancestor and afterwards, intriguingly, the medaka genome pre- 
served its ancestral karyotype for more than 300 Myr. 

We applied the whole-genome shotgun approach to an inbred 
strain, Hd-rR (ref. 8), derived from the southern Japanese population, 
as the main target. A total of 13.8 million reads amounting to approxi- 
mately 10.6-fold genome coverage were obtained from the shotgun 
plasmid, fosmid and bacterial artificial chromosome (BAC) libraries. 
A newly developed RAMEN assembler was used to process the shot- 
gun reads to generate contigs and scaffolds. The N50 values (50% of 
nucleotides in an assembly are in scaffolds—or contigs—longer than 
or equal to the N50 value) are ~1.41 megabases (Mb) for scaffolds and 
~9.8 kilobases (Kb) for contigs. The total length of the contigs reached 
700.4 Mb, which, from now on, we refer to as the medaka genome size. 
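
The N50 definition given above can be made concrete with a short sketch. The function name `n50` is hypothetical, and the input is assumed to be a list of scaffold or contig lengths; it is not part of the RAMEN assembler.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembled length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")
```

For example, for lengths 8, 8, 4, 3, 3, 2, 2 and 2 (total 32), the two longest sequences already hold half of all nucleotides, so the N50 is 8.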


To construct ultracontigs, the scaffolds were integrated with the 
medaka genetic map by using SNP markers. For this purpose, we 
further obtained about 2.8-fold coverage of shotgun reads from 
another inbred strain HNI (refs 9, 10), which is derived from the 
northern Japanese population. The reads were assembled by RAMEN 
to scaffolds covering 648 Mb. Aligning the HNI contigs with the Hd- 
rR genome using BLASTZ, we identified 16.4 million SNPs as well 
as 1.40 million insertions and 1.45 million deletions in non-repetitive 
regions (Supplementary Table 2). We selected 2,401 SNPs and gen- 
etically mapped them onto medaka chromosomes using a backcross 
panel between the two strains. Where possible, at least one SNP 
marker was selected in each Hd-rR scaffold of greater than 60 Kb. 
As a result, the N50 ultracontig size became ~5.1 Mb (excluding 
gaps), and 89.7% of the assembled nucleotides were anchored to 
the chromosomes. Aligning the Hd-rR assembly with reference 
BACs totalling 2.3 Mb showed that the overall nucleotide level accu- 
racy was 99.96% when 100 base pairs of contig ends were excluded 
(Supplementary Table 4). Details of genome assembly and analysis of 
basic features such as CpG islands and repeat elements are described 
in Supplementary Information. 

We first focus on polymorphisms between the two inbred strains, 
Hd-rR and HNI. The genome-wide SNP rate is 3.42%, which is, to 
our knowledge, the highest SNP rate seen in any vertebrate species. 
The SNP rate is not constant among chromosomes (Kruskal-Wallis 
test, P< 107°) (Fig. 1a), like the divergence between chimpanzee and 
human. This variation cannot be accounted for by the difference in 
gene density alone, because the SNP rates in exons and introns are 
correlated in most regions across chromosomes (Fig. 1a). The sub- 
stitution rate in CpG dinucleotides and the frequency of repetitive 
sequences might cause the variation because these factors loosely 
correlate with local nucleotide divergence rates (R² = 0.378 and 
0.455, respectively, as illustrated in Supplementary Fig. 5b and 5c). 

In human-chimpanzee analysis, the sex chromosomes exhibit the 
highest (Y) and lowest (X) mutation rates. Medaka also has an 
XX-XY sex-determining system, but the differentiation of its sex chromo- 
somes seems primitive; chromosome 1 (Chr 1; 33.7 Mb) serves as the 
X chromosome, whereas a duplicate Chr 1 with a 250-kb Y-specific 


1 Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa 277-0882, Japan. 
2 Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 113-0033, Japan. 
3 RIKEN Genomic Sciences Center, Yokohama 230-0045, Japan. 
4 National Institute of Informatics, Tokyo 101-8430, Japan. 
5 Department of Molecular Biology, Keio University School of Medicine, Tokyo 160-8582, Japan. 
6 Department of Molecular Preventive Medicine, School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan. 
7 Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo 108-8639, Japan. 
8 Department of Environmental Science, Faculty of Science, Niigata University, Niigata 950-2181, Japan. 
9 Center for Genetic Resource Information, National Institute of Genetics, Mishima 411-8540, Japan. 
*These authors contributed equally to this work. 




©2007 Nature Publishing Group 


NATURE|Vol 447|7 June 2007 


region that contains the male-determining gene, DMY (also known 
as dmrt1b), serves as the Y chromosome. The Y-specific region is 
thought to have jumped to Chr 1 about 10 Myr ago, before the sepa- 
ration of the medaka lineage. Thus, although we sequenced male 
genomes, it is difficult to distinguish whether the sequence reads of 
Chr 1 are derived from the X- or Y-chromosome. Indeed, the overall 
divergence rate in the medaka sex chromosome (Chr 1) does not differ 
much from autosomes (Fig. 1a). In the Hd-rR draft genome, the 
assembly of the Y-specific region is incomplete presumably owing to 
repetitive elements; instead, we detected polymorphisms in the reads 
from the region spanning at least 3.5 Mb around the Y-specific region, 
whereas the other region is highly homozygous. These polymorphisms 
in the inbred strain demonstrate the local suppression of crossing-over 
near the male-determining region, which has long been known in 
medaka genetic recombination tests. It seems that the male-deter- 
mining region causes restriction of recombination but its effect is 
limited to 10% of the length of Chr 1. The medaka Y chromosome 
might be at an early stage of sex differentiation. 

The two medaka inbred strains Hd-rR and HNI are estimated to 
have diverged about 4 Myr ago. A recent study using the mitochon- 
drial cytochrome b gene has elucidated the detailed phylogenetic 
relations among four genetically different wild populations 
Figure 1 | Genetic variation between two medaka strains. a, Sequence 
diversity in 200 Kb segments of chromosomes. Box edges, the quartiles; 
vertical bars, the range; 1*, sex chromosome. b, Phylogenetic analysis of the 
SNPs identified between Hd-rR and HNI during regional diversification. 
From the parental populations of HNI and Hd-rR, six individuals named 
Niigata 1/2/3 and d-rR 1/2/3 were analysed to find 5 and 1 additional SNPs, 
respectively. * indicates an inbred strain. 
(Northern Japanese, Southern Japanese, East Korean and 
Chinese-West Korean) (Fig. 1b). Despite the indubitable accumulation of 
genetic variation, they all can mate and produce healthy and fertile 
offspring. The massive SNP resources and living stock of medaka 
regional strains enabled us to perform a genome-wide SNP analysis 
for the history of wild populations during regional diversification. 
We analysed genetic variation in 47 PCR-amplified regions (approxi- 
mately 24.9 Kb in total) as shown in Fig. 1b. Nago and Kaga strains 
were chosen because they are most distantly related to Hd-rR and 
HNI within each population, respectively. We focused on 475 SNP 
sites identified between Hd-rR and HNI. The comparative analysis 
first revealed that 130 (27%) and 28 (5.8%) of 475 SNP sites were 
polymorphic in the southern and northern populations, respectively, 
and one mutation happened to be shared by both populations 
(Fig. 1b). The remaining 318 (475 − (130 + 28 − 1)) Hd-rR/HNI 
SNPs were thus preserved or fixed between the southern and north- 
ern populations (common SNPs in Fig. 1b). In comparison with the 
consensus sequence of the outgroup, Korea-Taiwan-China medaka 
(a regional medaka species with genetic variations compared with the 
Japanese medaka) strains, 120 and 185 mutation events were assigned 
to the southern and northern lineages, respectively, whereas 13 events 
were unclear. Overall, the total number of mutations introduced over 
4 Myr is 250 (120 + 130) for Hd-rR and 213 (185 + 28) for HNI, 
indicating that almost the same level of mutations accumulated in 
the history of the two strains (chi-squared test, χ² = 3.11, d.f. = 1, 
P > 0.05). On the other hand, the lower levels of polymorphism in 
the northern lineage support their rapid and recent expansion from a 
small size population (that is, a bottleneck effect), which has been 
suggested in a study with mitochondrial DNA. More importantly, 
the high ratio of common SNPs (>65%) as well as few shared poly- 
morphic sites indicates a strict genetic separation between the two 
populations for 4 Myr without major species differentiation (or spe- 
ciation). Further analysis will shed light on the genetic variations and 
speciation in vertebrates. 
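
The SNP bookkeeping in this paragraph can be checked with a small tally. The sketch below is illustrative only: the function name `snp_tally` and its argument names are hypothetical, and it assumes that the mutation events assigned to each lineage, together with the unclear events, partition the common SNPs, which the reported numbers are consistent with (120 + 185 + 13 = 318).

```python
def snp_tally(total, poly_south, poly_north, shared,
              fixed_south, fixed_north, unclear):
    """Tally common SNPs and per-lineage mutation totals.

    Assumes (as an interpretation of the text) that lineage-assigned and
    unclear events together account for every common SNP.
    """
    common = total - (poly_south + poly_north - shared)
    if fixed_south + fixed_north + unclear != common:
        raise ValueError("assigned events do not add up to the common SNPs")
    return {
        "common": common,
        "common_fraction": common / total,
        "south_total": fixed_south + poly_south,
        "north_total": fixed_north + poly_north,
    }


# Values reported in the text above.
tally = snp_tally(total=475, poly_south=130, poly_north=28, shared=1,
                  fixed_south=120, fixed_north=185, unclear=13)
```

With the reported values, `tally["common"]` is 318 (about 67% of the 475 sites), and the lineage totals are 250 for Hd-rR and 213 for HNI.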

To generate the medaka gene catalogue, we obtained over one 
million 5'-end serial analysis of gene expression (5'SAGE) tags, 
which correspond to the transcription start sites. The tags were 
grouped and, on the basis of transcription start site information, we 
predicted 20,141 non-redundant gene structures by a newly developed 
algorithm that uses Genscan. These predicted genes are genuinely tran- 
scribed; however, the predicted exons may not be entirely accurate, 
an unavoidable consequence of the ab initio method. Thus, we 
compared the predicted gene structures with 85 known full-length 
complementary DNA sequences. They matched by BLASTP at 83/85 
(97.6%) (expected value (E-value) < 10 1°) or at 72/85 (84.7%) (E- 
value < 10 ~°). Furthermore, 407 (58.6%) out of 694 exons in the 85 
cDNA sequences were perfectly predicted by our algorithm in which 
Genscan was used with the default setting for vertebrates. Full details 
are in Supplementary Information. 

To characterize the 20,141 predicted medaka genes, we used 
TBLASTX to compare them with the genes of six other verte- 
brates (human, Tetraodon nigroviridis, zebrafish, Takifugu rubripes, 
chicken and mouse) in the RefSeq database, and also with the gene 
clusters of aves, amphibia, ray-finned fish and ascidiacea in the 
UniGene database. We found that 3,727 have no homologues even 
with a loose criterion (E-value < 10 *) in a TBLASTX search (Fig. 2a). 
Then, we examined whether these novel gene candidates have any 
unique protein domains according to a PROSITE scan 
(http://ca.expasy.org/prosite/). With stringent search criteria, by which 
unique domains of more than 20 amino acids were detected for 
35.1% of non-novel predicted genes, only 30/3,727 (0.8%) of the 
new gene candidates were recognized as having known domains, 
suggesting that most of them are structurally unique. Interestingly, 
64.4% of these candidates have CpG islands on their upstream 
regions and the ratio is higher than the average (50.5%) of all the 
predicted genes. 
Among the 3,727 new gene candidates, 2,078 had no similarity 
even to medaka expressed sequence tags. Because 1,443 of 2,078 
had open reading frames shorter than 100 amino acids, many of them 
might be non-coding; therefore we tried to validate the new gene 
candidates by testing the expression of the predicted transcripts by 
PCR with reverse transcription (RT-PCR). As a result, we estimated 
the number of ‘true novel genes’ to be 1,287 out of 2,078 (see 
Supplementary Information for details). Taking into account the 
remaining 1,649 expressed-sequence-tag-supported ones, at least 
2,936 medaka-specific novel genes are estimated in the draft genome. 
This large number of new genes will provide a valuable genetic 
resource for medaka biology. 

Of the 20,141 predicted medaka genes, we further analysed 11,617 
(57.7%) that had human orthologues. A total of 4,342 (21.6%) genes 
constituted medaka-human reciprocally best 1:1 orthologue pairs 
such that TBLASTX E-values are less 
than 10 “* and the ratios of reciprocally aligned portions shared by 
two orthologues are at least 30%. Of these orthologue pairs, 2,292 
were assigned the gene ontology (GO) ‘biological process’ annota- 
tions of their corresponding human genes (Fig. 2b). Orthologues 
involved in carbohydrate metabolism, alcohol metabolism and 
catabolism were found to be more conserved than genes implicated 
in the immune response, transcription, reproduction, apoptosis and 
stress response. Furthermore, 925 of the 1,395 human disease genes 
in the Online Mendelian Inheritance in Man (OMIM) database have 
strong orthologues among the medaka genes, such as A2M 
(alpha-2-macroglobulin) and PSEN1 (presenilin 1), which is implicated in 
Alzheimer’s disease, and TP53 (tumour suppressor protein p53) 
and DLEC1 (deleted in lung cancer-1; also known as CLEC4C), which 
are both involved in carcinogenesis. 

To gain insight into gene evolution and species differentiation, we 
examined rapidly and slowly evolving gene categories between the 
two medaka inbred strains Hd-rR and HNI. The average Ka/Ks ratio 
of 8,889 qualified medaka predicted genes between the two strains is 
0.413, significantly higher than that for the human-chimpanzee 
lineage (0.23 for Ka/Ks), which has experienced major speciation 
for 5 Myr. We used the median Ka/Ks ratio of each GO-based func- 
tional category of medaka genes, and plotted it against that of 
human-chimpanzee (Fig. 2c). We focus here on the specific categor- 
ies referred to by previous analyses among mammalian species, which 
included immunity, host defence, reproduction and olfaction as 
rapidly evolving categories, and intracellular signalling, neurogenesis 
Figure 2 | Medaka genes. a, Breakdown of medaka gene homologues in 
other species. b, Similarity of medaka—human 1:1 orthologous pairs is 
arranged by GO category. Number of orthologues is in parentheses. Bars 
show median (thick) and mean (dashed) similarity of orthologues. c, Dots 
represent pairs of medians of Ka/Ks ratios in the medaka strains and the 
human-chimpanzee lineage for GO categories with ≥10 genes. Rapidly and 
slowly evolving GO categories in medaka relative to the hominid lineage are 
coloured yellow and blue, respectively. 
and neurophysiology as slowly evolving ones. The rapidly evolving 
categories are thought to be involved in adaptation to environment 
and sexual separation, both of which are essential processes during 
and/or after speciation. Intriguingly, these rapidly evolving categor- 
ies are not evident in the medaka lineage, whereas mammalian slowly 
evolving neural-related categories exhibit relaxed constraint. Thus 
the reduced rate of evolution in the reproduction- and sex-related 
categories might explain why the two medaka strains can mate and 
produce fertile offspring even after a long period of geographical and 
genetic separation. One example is the ubiquitin specific protease 9Y 
(USP9Y) gene for male gametogenesis on hominid Y chromosome 
(Ka/Ks = 1.0) or on medaka LG21 autosome (Ka/Ks = 0.27). In gen- 
eral, the extent of phenotypic variation between organisms is not strictly 
related to the degree of sequence variation, and this is also the case for 
species differentiation of medaka. Our comparative analysis thus sug- 
gests that differential selective pressures act on specific categories in a 
lineage-specific manner, and this may contribute to a pattern of evolu- 
tion, for example, adaptation with or without speciation. 

Whole-genome duplication (WGD) and subsequent asymmetric 
changes in duplicated genes are thought to have an important role in 
genome evolution. Recently, several studies have examined WGD 
in the teleost lineage and have reconstructed ancestral karyo- 
types using the available genomic data. Although these previous 
studies attempted to estimate the number of proto-chromosomes 
before the WGD event, interchromosomal genome rearrangements 
during evolution, and the correspondence between the proto- 
chromosomes and present chromosomes, there have been no clear 
answers concerning the timing of major interchromosomal rearran- 
gements. Thus, we have conducted large-scale four-way comparisons 
of the medaka, human, zebrafish and Tetraodon genomes (see Sup- 
plementary Information for full details of scenario construction). 

Here we summarize our scenario of genome evolution from the 
ancestral karyotype to the three teleost genomes. Figure 3 illustrates 
one example of how we inferred the ancestral chromosomes, and 
Fig. 4 depicts our scenario. The dates we adopt for the WGD and 
lineage divergence are based on molecular clock estimates. The 
key events we propose are as follows. 

• In a relatively short period of ~50 Myr after the WGD event 
(336-404 Myr ago), the MTZ-ancestor (the last common ancestor of 
medaka, Tetraodon, and zebrafish) had 24 chromosomes and had 
undergone 8 major interchromosomal rearrangements (2 fissions, 
4 fusions and 2 translocations). 

• In contrast, since zebrafish diverged about 314-332 Myr ago, 
the medaka genome has preserved its ancestral genomic structure 
without undergoing major interchromosomal rearrangements for 
more than 300 Myr. The Tetraodon genome underwent fusion events 
on three occasions after separating from the medaka lineage about 
184-198 Myr ago. 
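
The chromosome numbers implied by the key events above can be verified with simple bookkeeping. The sketch below is illustrative and rests on stated assumptions: it starts from the 13 reconstructed ancestral chromosomes shown in Fig. 4, treats the WGD as a doubling, counts each fission as +1 and each fusion as −1, and assumes translocations leave the chromosome number unchanged. The function name `apply_events` and the event labels are hypothetical.

```python
def apply_events(n_chromosomes, events):
    """Apply (kind, times) karyotype events to a chromosome count."""
    for kind, times in events:
        if kind == "duplication":
            n_chromosomes *= 2 ** times   # whole-genome duplication
        elif kind == "fission":
            n_chromosomes += times        # one chromosome splits into two
        elif kind == "fusion":
            n_chromosomes -= times        # two chromosomes merge into one
        elif kind == "translocation":
            pass                          # assumed not to change the count
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return n_chromosomes


# 13 proto-chromosomes -> WGD -> 8 major rearrangements -> MTZ-ancestor
mtz_ancestor = apply_events(13, [("duplication", 1), ("fission", 2),
                                 ("fusion", 4), ("translocation", 2)])

# Three later fusions in the Tetraodon lineage
tetraodon = apply_events(mtz_ancestor, [("fusion", 3)])
```

Under these assumptions the count for the MTZ-ancestor is 13 × 2 + 2 − 4 = 24, matching the 24 chromosomes stated above, and three subsequent fusions would leave 21 chromosomes in the Tetraodon lineage.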

The zebrafish genome seems to have experienced many interchro- 
mosomal rearrangements during evolution by extensive transloca- 
tions, but the precise scenario remains to be solved because of a 
relatively small number of zebrafish genome markers that we used in 
the present study. Nevertheless, these zebrafish genome markers were 
useful in revealing the eight major interchromosomal rearrangements 
Figure 3 | Reconstruction of proto-chromosomes. a, Doubly conserved 
synteny (DCS) blocks between one human and two medaka 
chromosomes were searched to identify duplicated medaka chromosomes. 
Human-medaka orthologues are plotted by orange or grey bars in the rows 
of human chromosomes (Hsa), corresponding to counterpart medaka 
chromosomes; in this case, four rows corresponding to Ola15, Ola19, Ola1, 
and Ola8. The series of co-occurrences of orthologues (orange plots) in the 
rows of Ola15 and Ola19 in Hsa10 shows a DCS block, and similarly more 
DCS blocks are found in other chromosomes. Medaka chromosomes that 
share a DCS block are thought to share a proto-chromosome. Ola8 and 
Ola15 share no DCS blocks, implying that they were from distinct proto- 
chromosomes. On the basis of this logic, we deduced that parts of Ola15, 
Ola19 and Ola1 are derived from a proto-chromosome (d, orange), whereas 
parts of Ola19, Ola1 and Ola8 are from another proto-chromosome (e, grey). 
Purple, traces of teleost proto-chromosome c (see Fig. 4 and Supplementary 
Fig. 13). Next, to analyse chromosome fission events, we acknowledge that 
two chromosomes generated by a fission event are unlikely to have common 
paralogues derived from the WGD in teleost. The table (centre, bottom) 
shows the number of paralogues between medaka chromosomes. The fairly 
small number between Ola1 and Ola19 implies that they were the result of 
chromosome fission. Finally, the correspondence among chromosomes of 
the three fishes was determined using the orthologue information in the 
tables (left, top and bottom). See Supplementary Fig. 13 for information on 
the other proto-chromosomes. b, The dot plots exhibit focal synteny blocks 
between medaka and Tetraodon chromosomes, which present more 
accurate synteny than does the table of orthologue numbers. Ola, Oryzias 
latipes chromosome; Dre, Danio rerio chromosome; Tni, Tetraodon 
nigroviridis chromosome; and Hsa, Homo sapiens chromosome. 
Figure 4 | Teleost genome evolution. The figure depicts a model for the 
distribution of ancestral chromosome segments in the human, zebrafish, 
medaka and Tetraodon genomes. Thirteen reconstructed ancestral 
chromosomes are represented by the coloured bars, and the genomic 
regions originating in the ancestral chromosomes (a–m) have the same
colour coding. Major rearrangements are represented by arrows and 


before the divergence of the three fishes. Because more than half of all 
the teleost species examined have either 24 or 25 chromosomes, it has 
been speculated that the teleost ancestor also had 24 or 25 chromo- 
somes'’. This is consistent with our current findings that the recon- 
structed MTZ-ancestor, which is the common ancestor of most 
teleosts'*®’, had 24 chromosomes. Our scenario is the most likely 
one, but there is an alternative: that Ola10, 13, and 14 are derived from
one ancestral chromosome instead of two; and one of the duplicated
ancestral chromosomes became Ola14, whereas the other underwent a
fission event, yielding Ola10 and Ola13 (12 ancestral chromosome
model). The alternative scenario seems unlikely because it assumes that 
the fission occurred in the same position in human and fish chromo- 
somes derived from the common ancestor (Supplementary Fig. 15). 
The high-quality medaka draft genome will provide a key resource 
for developmental genetics and serve as an important reference for 
various ray-finned fishes that have yet to be sequenced. Cichlids and 
stickleback, which are newly emerging model systems for under- 
standing the genetic basis of vertebrate speciation, are evolutionarily 
closer to medaka than zebrafish. The same is true for many com- 
mercially important fish species, which include tuna, flounder, sea 
bream and fugu as well’*. Indeed, the medaka and Tetraodon gen- 
omes have long synteny blocks in the genome (Supplementary Fig. 
12). Furthermore, there are many close relatives of medaka that are

lineage-specific small-scale translocations by dotted arrows. A dashed box in
the zebrafish genome indicates that most parts of the corresponding 
ancestral chromosome were lost by extensive rearrangements. Individual 
ancestral chromosomes are labelled a–m, which corresponds to the
evolutionary scenarios in Supplementary Information. 


indigenous from East to Southeast Asia’, many of which are now 
maintained in laboratories. Together with the medaka draft genome, 
low-coverage shotgun genome sequences of these fish species will 
shed further light on the mechanisms underlying speciation and 
diversity, which have yet to be fully addressed at the genome- 
sequence level. 


METHODS 

Full details of methods are described in Supplementary Information. The 
medaka strain Hd-rR was provided by Y. Ishikawa. Other strains (HNI, d-rR, 
Kaga, Nago, HSOK, Taiwan and Kunming) were from our (H.T. and M.S.) 
laboratory stocks, except for Niigata from H. Mitani. These strains are available 
from the National BioResource Project (http://shigen.lab.nig.ac.jp/medaka/). 
Sperm DNA of Hd-rR was provided by M. Matsuda and used for whole-genome 
shotgun. Genomic DNA was also prepared from male adult bodies. Messenger 
RNA for 5’SAGE analysis was obtained from 0–7-day-old embryos and adult
body tissue. Whole-genome shotgun assembly was made with the RAMEN 
assembler (to be published elsewhere) and using adaptations to various pro- 
blems, including PCR slippage. Ultracontigs were produced by anchoring 
scaffolds to the chromosomes by using genetic markers, including a large num- 
ber of SNP markers. The descriptions of sequence assembly refer to the latest 
assembly version 1.0, whereas other analyses were based on version 0.9. The two 
assemblies have almost the same contigs and scaffolds, but the former assembly 
has longer ultracontigs because more genetic markers were integrated. The
medaka gene catalogue was generated by our transcription-start-site-based gene
prediction algorithm. The point of this algorithm is to enable us to predict the 
first exon and 5’ UTR, which were difficult to predict solely by a conventional 
gene prediction tool like Genscan. A formula was developed to define CpG 
islands for medaka. Novel repeats that occupied 9.2% of the medaka genome 
were found by our de novo repeat detection algorithm. The synonymous (Ks) and 
the non-synonymous (Ka) substitution rates of individual genes were calculated
using the PAML package. The method for reconstruction of the ancestral karyo- 
type is described briefly in the legend of Fig. 3. The method used orthologous 
chromosome correspondence among the three teleost genomes, paralogous 
chromosome correspondence between the medaka and Tetraodon genomes, 
and doubly conserved synteny blocks between the medaka and human genomes.

Dscam2 mediates axonal tiling in the Drosophila visual system


S. Sean Millard’, John J. Flanagan’, Kartik S. Pappu’, Wei Wu’ & S. Lawrence Zipursky' 


Sensory processing centres in both the vertebrate and the inverte- 
brate brain are often organized into reiterated columns, thus facil- 
itating an internal topographic representation of the external 
world. Cells within each column are arranged in a stereotyped 
fashion and form precise patterns of synaptic connections within 
discrete layers. These connections are largely confined to a single 
column, thereby preserving the spatial information from the peri- 
phery. Other neurons integrate this information by connecting to 
multiple columns. Restricting axons to columns is conceptually 
similar to tiling. Axons and dendrites of neighbouring neurons of 
the same class use tiling to form complete, yet non-overlapping, 
receptive fields’. It is thought that, at the molecular level, cell- 
surface proteins mediate tiling through contact-dependent re- 
pulsive interactions’”*°, but proteins serving this function have 
not yet been identified. Here we show that the immunoglobulin 
superfamily member Dscam2 restricts the connections formed by 
L1 lamina neurons to columns in the Drosophila visual system. 
Our data support a model in which Dscam2 homophilic inter- 
actions mediate repulsion between neurites of L1 cells in neigh- 
bouring columns. We propose that Dscam2 is a tiling receptor for 
L1 neurons.

The Drosophila visual system is a modular structure®’. The retina 
contains 750 simple eyes, each containing eight photoreceptor neu- 
rons or R cells (R1-R8). R cells project into the brain, where they 
make connections within two neuropils, the lamina and medulla. 
R1-R6 neurons target to the lamina, where they form synapses with 
lamina neurons (L1–L5). R7, R8 and L1–L5 form connections in
single columns within layers in the medulla, and each column con- 
tains one axon of each of these cell types. As a consequence of this 
wiring pattern, each column processes motion (lamina neurons) and 
colour (R7 and R8) from a single point in space®. Although some 
progress has been made in understanding how neurons select differ- 
ent layers within each of the 750 columns’, the molecular mechan- 
isms that restrict synaptic connections to a single column are not 
known. 

Dscam2 belongs to a conserved family of cell-surface proteins 
expressed in the nervous systems of many different organisms*"°. 
Down syndrome cell adhesion molecule (DSCAM) was originally 
identified as an open reading frame in a region of human chro- 
mosome 21 critical for Down’s syndrome''. There are four Dscam 
genes in the fly genome (Dscam, and Dscam2-4). They encode type I 
transmembrane proteins that share about 30% sequence identity and 
have a common extracellular domain comprising ten immuno- 
globulin and six fibronectin type III repeats (Fig. 1a). These proteins 
have divergent cytoplasmic tails. The genomic organization of each 
fly Dscam family member differs considerably. Dscam encodes four 
cassettes of alternative exons that can potentially generate 38,016 
different proteins through mutually exclusive alternative splicing’. 
Dscam has a function in forming neural circuits throughout the fly

Figure 1 | Dscam2 is required for visual system development. a, Drosophila
Dscam family members. The percentage identity between the extracellular 
domains is shown at the left, and the number of amino acid residues in the 
protein at the right. Dscam isoforms differ within three immunoglobulin 
domains (coloured horseshoes). Dscam2 has two isoforms differing at 
immunoglobulin domain 7 (red horseshoe). Immunoglobulin domains, 
horseshoes; FN domains, black boxes; transmembrane domains, blue bars. 
b, Homologous recombination (HR) scheme to knock out Dscam2 (see 
Methods). kb, kilobases; w+ indicates the white gene which is used as a 
marker to detect recombinants. c, Molecular verification of the targeting 
event by polymerase chain reaction. WT, wild type. d, e, Dscam2 mutants are 
protein-null. The images show wild-type (d) and Dscam2 mutant (e) pupal 
brains stained with a Dscam2 antibody 40h after puparium formation 
(APF). f, g, R7 and R8 projections in the medulla stained with monoclonal 
antibody 24B10 (red) are disorganized in adult Dscam2 mutant brains. The 
projections of other neuronal classes were also disrupted (see 
Supplementary Fig. 1).

brain'*’”. Dscam isoforms bind homophilically"’, and in vivo studies
indicate that these interactions promote repulsion’**'. Dscam2–4 do
not show extensive isoform diversity, and in this way these family 
members are more similar to mammalian DSCAMs. Dscam2 has 
two alternative immunoglobulin 7 domains that share about 50% 
sequence identity and are referred to as Dscam2A and Dscam2B. 
Given the structural similarities between Dscam and Dscam2 and 
the prominent expression of Dscam2 on neurites in the developing 
brain (see Fig. 1d), we proposed that interactions between Dscam2 
proteins are required for patterning neuronal connections. 
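
The figure of 38,016 potential Dscam isoforms quoted above is the product of the number of mutually exclusive alternatives in each of the four exon cassettes. As a minimal sketch, assuming the commonly cited cassette sizes of 12, 48, 33 and 2 alternatives (these per-cassette counts and exon names are not given in the text and are stated here as an assumption), the arithmetic can be checked as follows:

```python
from math import prod

# Assumed numbers of mutually exclusive alternative exons per cassette
# (commonly cited for Drosophila Dscam; not stated in the text above).
DSCAM_CASSETTES = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

def isoform_count(cassettes):
    """One alternative is chosen per cassette, so the choices multiply."""
    return prod(cassettes.values())

dscam_isoforms = isoform_count(DSCAM_CASSETTES)
# Dscam2 has a single two-way choice (immunoglobulin domain 7: A or B).
dscam2_isoforms = isoform_count({"immunoglobulin 7": 2})
print(dscam_isoforms, dscam2_isoforms)
```

The same function applied to the single two-way choice in Dscam2 gives two isoforms, which illustrates how limited its diversity is relative to Dscam.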

To assess the function of Dscam2, we generated protein-null 
mutations in the gene by homologous recombination” (Fig. 1b-e; 
see Methods). The Dscam2 mutants were viable but had marked 
defects in R-cell projections into the medulla (Fig. 1f, g). Using a 
panel of cell-type specific markers in the medulla (Supplementary 
Fig. 1), we observed widespread defects in axonal and dendritic 
organization. As wiring defects in one class of neurons may indirectly 
affect other classes, it was not possible to accurately assess the function
of Dscam2 in homozygous mutant animals.

Figure 2 | Dscam2 restricts L1 arbors to columns. a, Schematic of lamina
neuron and R-cell projections in the medulla. Each cell targets to a specific
layer (m1–m6, left) and is restricted to a single column (right). b, c, R7 and
R8 do not require Dscam2. The terminals of mutant R7 (b) and mutant R8 
(c) (yellow, bottom) in adult brains are indistinguishable from wild-type R7 
and R8 (yellow, top). All R7 and R8 axons (red) are stained with monoclonal 
antibody 24B10 in this figure. d, MARCM scheme. A lamina-specific 
enhancer was used to drive FLP in lamina precursor cells. Homozygous

To identify a specific cell type that requires Dscam2, we removed it
from subsets of neurons by using genetic mosaic techniques. We 
targeted four cell types (R7, R8, L1 and L2) that connect to specific 
layers within each medulla column (Fig. 2a). To assess whether 
Dscam2 was required in R7 and R8, genetically mosaic animals were 
generated in which mutant R7 and R8 cells projected into an otherwise
wild-type brain. R7 (n = 87) and R8 neurons (n = 336; see
Methods) lacking Dscam2 formed patterns of projections that were 
indistinguishable from their wild-type counterparts (Fig. 2b, c). 

We extended our analysis to a subset of lamina neurons, L1 and L2. 
L1 axons arborize in two medulla layers, m1 and m5. In contrast, L2 
axons form a single terminal arborization at the m2 layer. To assess 
whether Dscam2 is required in L1 and L2 neurons, we generated 
single mutant cells in an otherwise wild-type background, using 
the MARCM technique”. To do this, we expressed FLP recombinase 
under the control of a Dachshund (Dac) enhancer** (see Methods) to 
induce recombination selectively in lamina precursor cells just before 
their final cell division (Fig. 2d). In wild-type controls, fewer than ten 
lamina neurons were labelled per optic lobe. Of these, 90% were L1

mutant cells lacking the Gal80 repressor were labelled with actin-Gal4 and
UAS-CD8GFP (green). e, f, L2 cells do not require Dscam2. Wild-type
(e) and mutant (f) L2 terminals (green) were indistinguishable. g-i, L1 cells
require Dscam2 for columnar restriction. Wild-type L1 axons (g) arborized 
in the m1 and m5 layers of the medulla and were restricted to a single column 
(dashed lines). Mutant L1 cells (h, i) targeted to the correct layers, but their 
arbors were not restricted to a single column. Animals were analysed at 
about 70% APF in e-i.

neurons and 10% were L2. Wild-type L1 (Fig. 2g; n = 165) and L2
(Fig. 2e; n = 28) cells arborized in the correct layers and were
restricted to a single column. Other lamina neurons were not labelled 
by this procedure (see Methods). 

Dscam2 mutant L1 neurons arborized in the correct layers. These 
arbors, however, were no longer restricted to a single column (67%; 
n = 228) and often extended over several columnar units (Fig. 2h, i).
These neurons formed terminal structures within the appropriate
layers in adjacent columns. Phenotypes were observed in m1, in
m5 or in both of these layers. In some cases (less than 10%) L1 axons
bifurcated between m1 and m5 and each branch targeted to the
appropriate layer in adjacent columns (see Supplementary Fig. 2). 
In marked contrast to mutant L1 neurons, the terminal arbors of 
mutant L2 neurons were indistinguishable from the wild type 
(Fig. 2e, f; n = 97). In summary, Dscam2 is required within L1 neurons
to restrict arbors to a single column. Conversely, R7, R8 and L2
axons are restricted to a single column by Dscam2-independent 
mechanisms. 

How might Dscam2 restrict L1 processes to a single column? 
Columnar restriction in the medulla is reminiscent of dendritic til- 
ing’. Here dendrites of neighbouring cells of the same class do not 
overlap. Although the molecular mechanisms underlying tiling are 
not known, it has been proposed that they involve homotypic repulsion
between cells of the same type*. If Dscam2 restricts L1 processes
in this manner then we would predict, first, that Dscam2 would 
exhibit homophilic binding; second, that L1 processes expressing 
Dscam2 would contact each other during development and then 
retract to a single column; and third, that wild-type L1 axonal pro- 
cesses would extend into adjacent columns in which L1 neurons were 
Dscam2 mutant. 

To assess whether Dscam2 exhibits homophilic binding, we used 
cell aggregation assays and pull-down experiments as described prev- 
iously for Dscam'*””. Two S2 cell populations expressing different 
Dscam2 isoforms (Dscam2A and Dscam2B) segregated into isoform- 
specific clusters (Fig. 3a, b). Similar results were obtained from 
mixing experiments between Dscam2 and either Dscam or Dscam3 
(data not shown). Confirming this binding specificity, Dscam2 ecto- 
domains fused to human Fc bound only to the full-length Dscam2 
proteins with the identical ectodomain (Fig. 3c, d). In summary, 
Dscam2 interacts with itself in an isoform-specific manner and does 
not bind to other Dscam family members. 

To assess whether L1 processes contact each other during develop- 
ment and whether Dscam2 is expressed in these layers, we examined 
wild-type L1 arborization patterns and Dscam2 antibody staining 
during pupal development. Using MARCM to label L1 cells, we 
observed growth cone expansions and immature interstitial branches 
at 30h after puparium formation (APF) (Fig. 3e). About 10h later, 
m1 and m5 arbors were exuberant, not restricted to columns, and 
neurites from neighbouring labelled cells contacted each other 
(Fig. 3f). During subsequent development these processes retracted 
and were restricted to a single column by 70 h APF (Fig. 3g). Dscam2 
was expressed within these layers throughout this time course. 
Expression peaked at 40h APF and was markedly reduced by 70h 
APF, by which time L1 arbors were restricted to a single column 
(Fig. 3h-j). It is not possible to determine which cells within these 
two layers account for the Dscam2 immunoreactivity; however, the 
results of our genetic studies make it likely that minimally, L1 pro- 
cesses are Dscam2 positive. Dscam2 is also found in other layers, but 
at only low levels or not at all in R7 and R8 growth cones (Fig. 3j). 

If L1 axons are restricted to a single column by Dscam2 homo- 
philic interactions, then wild-type L1 arbors should display a pheno- 
type when they contact mutant axons lacking Dscam2. To address 
this, we used reverse MARCM”*. As with MARCM, both wild-type 
and mutant lamina neurons are generated, but in reverse MARCM 
only the wild-type cells are labelled (Fig. 4a). As the frequency of 
generating labelled cells is low, the likelihood that a labelled wild-type
L1 axon and a mutant lamina axon will be present in the same or
an adjacent column is correspondingly low. In control experiments,
labelled wild-type cells were restricted to columns in a wild-type 
genetic background (Fig. 4b; n = 444). In contrast, of 466 wild-type

Figure 3 | Dscam2 binds homophilically and its expression is correlated
with L1 arbor retraction during development. a, b, Cell aggregation assay (see 
Methods). a, Control. Cells expressing Dscam2A (marked by co-expression of 
red fluorescent protein) mixed with cells expressing Dscam2A (marked by co- 
expression of green fluorescent protein). b, Cells expressing Dscam2A (red) 
and Dscam2B (green) segregate from one another, showing that homophilic 
interactions are isoform-specific. c, d, Pull-down assay. Dscam, Dscam2A,
Dscam2B and Dscam3 ectodomain–Fc fusion proteins bound their cognate
Flag-tagged full-length protein in extracts of transfected S2 cells (c; see
Methods). d, Dscam2A or Dscam2B ectodomain–Fc fusion proteins bind to
themselves but not other Dscam proteins. Right, inputs. The flag symbols in
c and d indicate the Flag epitope. e-g, Wild-type L1 arbor development. At
30h APF (e), wild-type L1 cells consist of a terminal growth cone and nascent 
m1 arbors. At about 40 h APF (f), L1 arbors in adjacent columns contact each 
other. The third column from the left does not contain a labelled L1 cell and 
this permits the detection of invading neurites from columns 2 and 4. At 
about 70 h APF (g), L1 processes are restricted to a single column. h, i, Dscam2 
protein expression in the medulla during pupal development. Dscam2 
expression peaks during the retraction phase of L1 development (40h APF; 
h), and is then downregulated (70 h APF; i). j, Dscam2 distribution (green) is 
non-uniform at 40 h APF. Left, image also stained with monoclonal antibody 
24B10 (red). Middle, ×2.5 magnification of the boxed region. Right, at this
stage, L1 arbors reside immediately above R7 and immediately below R8 in 
layers with strong Dscam2 staining.

L1 neurons examined using reverse MARCM, we observed 15 neurons
extending processes into adjacent columns (Fig. 4c–e). Thus,
Dscam2 homophilic interactions are required for restricting L1 
arbors to columns. 
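
To put the reverse MARCM counts in proportion, the sketch below computes the fraction of labelled wild-type L1 neurons showing a phenotype, together with a one-sided Fisher exact probability for the comparison with controls. It assumes that none of the 444 control neurons showed a phenotype, which is one reading of "restricted to columns"; the test is an illustration and not an analysis reported in the paper, and `fisher_one_sided` is a hypothetical helper.

```python
from math import comb

def fisher_one_sided(a, n1, b, n2):
    """P(at least `a` of the a+b affected cells fall in group 1) under the
    hypergeometric null that phenotypes are spread at random over both groups."""
    total, affected = n1 + n2, a + b
    upper = min(affected, n1)
    tail = sum(comb(n1, k) * comb(n2, affected - k) for k in range(a, upper + 1))
    return tail / comb(total, affected)

reverse_marcm_affected, reverse_marcm_n = 15, 466  # counts from the text
control_affected, control_n = 0, 444               # 0 is an assumption

fraction = reverse_marcm_affected / reverse_marcm_n
p_value = fisher_one_sided(reverse_marcm_affected, reverse_marcm_n,
                           control_affected, control_n)
print(f"affected fraction: {fraction:.3f}, one-sided p: {p_value:.1e}")
```

The low frequency is expected under the reasoning above, because a labelled wild-type L1 axon only rarely lies next to a mutant one.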

As both L1 and L2 mutant neurons are generated by Dac-FLP 
induced MARCM (see above), Dscam2 could restrict L1 arbors either 
through repulsive interactions between L1 axons in adjacent columns 
or through adhesive interactions between L1 and L2 axons in the 
same column. Interactions with L2 axons are unlikely for two reasons:
first, although L2 axons extend through the m1 layer, and thus

Figure 4 | Dscam2 homophilic interactions are required for axonal tiling.
a, Reverse MARCM scheme. The Dscam2 mutation is on the Gal80- 
containing chromosome so that the wild-type, but not mutant, cells are 
labelled. b, A wild-type L1 neuron (green) in a wild-type background 
generated by MARCM (control). c–e, Non-autonomous tiling phenotypes in
wild-type cells using the reverse MARCM technique. Note the unidirectional 
nature of the phenotype. R cells (red) are labelled with monoclonal antibody 
24B10. f, Left, Possible outcomes of reverse MARCM. Non-autonomous 
phenotypes could arise from interactions with a Dscam2 mutant cell in the 
same or an adjacent column. A requirement in the same column would 
generate a bidirectional phenotype, whereas a requirement in an adjacent 
column would generate a unidirectional phenotype. Right, Observed result 
and interpretation. The reverse MARCM phenotype is exclusively 
unidirectional (see also Supplementary Fig. 2), indicating that Dscam2 
homophilic interactions mediate repulsion. We propose this is due to a 
mutant ‘unlabelled’ L1 neuron (red) in the adjacent column. g, Model for 
columnar restriction of L1 arbors. Neurites from L1 cells in adjacent 
columns interact through Dscam2 homophilic contacts. This generates a 
repulsive signal resulting in the retraction of neurites to their column of 
origin. It is important to note that if Dscam2 expression is not restricted to 
L1 neurons in the layer, then isoform-specific or co-receptor-specific
mechanisms may restrict Dscam2 activity to L1 neurons within these layers.

could mediate interactions with L1 processes in this layer, they do not
extend to the m5 layer, and second, the reverse MARCM phenotype is 
exclusively asymmetric, suggesting that the mutant axon resides in 
an adjacent column (Fig. 4f, and Supplementary Fig. 2). In MARCM 
experiments, 61% of the mutant arbors extended in both directions, 
but under reverse MARCM conditions none of the phenotypes were 
bidirectional. These data argue that Dscam2 mediates axonal tiling 
between L1 processes in neighbouring columns (Fig. 4f, g). 
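
The logic behind this interpretation can be made explicit with a toy model. The sketch below is an illustration only, under simplifying assumptions that are not taken from the paper: columns lie on a line, each column holds one L1 cell, an arbor stays in a neighbouring column unless it is repelled, and repulsion occurs only when both contacting cells express Dscam2 (homophilic binding). The function name `arbor_spread` is hypothetical.

```python
def arbor_spread(expresses_dscam2, column):
    """Return the directions ('left'/'right') in which the L1 arbor of
    `column` stays in a neighbouring column in this toy model.

    expresses_dscam2: list of booleans, one L1 cell per column.
    Repulsion needs Dscam2 on BOTH contacting cells (homophilic binding).
    """
    directions = []
    for name, neighbour in (("left", column - 1), ("right", column + 1)):
        if not 0 <= neighbour < len(expresses_dscam2):
            continue  # no neighbouring column at the edge of the array
        repelled = expresses_dscam2[column] and expresses_dscam2[neighbour]
        if not repelled:
            directions.append(name)
    return directions

wild_type = [True] * 5
# MARCM-like case: a single mutant L1 cell (column 2) among wild-type cells.
single_mutant = [True, True, False, True, True]

print(arbor_spread(wild_type, 2))      # wild-type control
print(arbor_spread(single_mutant, 2))  # labelled mutant cell
print(arbor_spread(single_mutant, 1))  # wild-type cell next to a mutant
```

In this model a mutant cell can spread in both directions, whereas a wild-type cell bordering a single mutant cell spreads only towards it. This mirrors the bidirectional MARCM phenotypes and the exclusively unidirectional reverse MARCM phenotypes described above.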

Columnar restriction is a common organizing principle used by 
many sensory systems that relay spatial information from the peri- 
phery to processing centres in the brain. As a result of the reiterative 
nature of these circuits, multiple targets are available in close prox- 
imity to each other within the same layer. Local repulsion between 
axonal processes of identical neurons in adjacent columns, which 
make connections with these targets, provides a developmental strat- 
egy for preserving the spatial information in each circuit. Here we 
show that Dscam2 is a homophilic tiling receptor for L1 neurons. 
Axonal tiling ensures that synaptic connections are made exclusively 
with targets in a single column. 

The functions of Dscam and Dscam2 have intriguing similarities 
and differences. Although both promote homophilic repulsion 
between neurites, they do so in different cellular contexts. As each 
neuron expresses a unique set of Dscam isoforms, neurites from the 
same cell selectively recognize and repel each other'’*'?”*. This 
process, called ‘self avoidance’, facilitates the uniform coverage of 
synaptic fields in the nervous system’*'”*'. By contrast, Dscam2 
mediates repulsive interactions between neurites of the same cell 
type. This process, called tiling, limits connections to a local area. 
Tiling and self avoidance therefore act in concert to pattern dendritic 
and axonal fields in the nervous system. 


METHODS SUMMARY 

MARCM and reverse MARCM experiments. To generate Dscam2 mutant 
lamina neurons, we used a Dac-FLP source on chromosome II and labelled 
the mutant cells with actin-Gal4, UAS-CD8GFP. Only L1 and L2 lamina neurons
were labelled using this scheme. Using a different Dac-FLP source and other Gal4 
sources, and performing mitotic recombination on a different chromosome 
arm, clones in all lamina neurons can be generated with this system (A. Nern 
and S.L.Z., unpublished observations). Thus, it remains formally possible that 
our MARCM experiments generated some unlabelled mutant lamina neurons. 
For reverse MARCM, two copies of Dac-FLP were used to increase the frequency 
of mitotic recombination (see fly stocks in Methods). Again, L1 and L2 cells 
were preferentially labelled. Wild-type MARCM clones generated with two 
copies of Dac-FLP were used as controls for the reverse MARCM experiments. 
Control and experimental samples were coded, mixed together, and scored 
blindly to avoid any bias. All other experimental procedures are described in 
Methods. 
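
Blind scoring of this kind is straightforward to script. The sketch below is a generic illustration, not the procedure used here: the sample names, the code format and the fixed seed are all hypothetical. It shuffles the samples, assigns each a neutral code, and keeps the key separate until scoring is complete.

```python
import random

def blind_samples(samples, seed=None):
    """Return (coded_order, key): neutral codes for the scorer and a
    code -> (sample, group) key to be kept hidden until scoring is done."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)  # mix control and experimental samples together
    key = {f"S{idx:03d}": sample for idx, sample in enumerate(shuffled, start=1)}
    return list(key), key

# Hypothetical sample list: (identifier, group).
samples = [("brain_01", "control"), ("brain_02", "reverse MARCM"),
           ("brain_03", "control"), ("brain_04", "reverse MARCM")]
coded_order, key = blind_samples(samples, seed=0)
print(coded_order)  # what the scorer sees
```

Because the codes carry no group information, the scorer cannot tell control from experimental samples until the key is consulted.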


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS

Fly stocks. The following stocks were used for ends-out homologous recombination
(see below for a description of the method): ‘HR stock’, w; P[70-ISce-
1],4P[70-FLP], ScO/CyO and ‘Tester stock’, w; 70-FLP (constitutive); TM2/TM6b.
The Dscam2 mutant alleles generated by homologous recombination
were designated as Dscam2"""», Dscam2”"? and Dscam2 "> and were main- 
tained over TM6b. Markers for C3 and T1 neurons were 568-Gal4 and 10-50- 
Gal4, respectively. R7 MARCM was performed largely as previously described”. 
The stocks used were GMRFLP; Dscam2"", FRT79/CyO:TM6b and PanR7-Gal4,
UAS N-synaptobrevin GFP/CyO; Gal80, FRT79/TM6b. The stocks used for R8
mosaics were ey3.5FLP; RpS17, arm-lacZ, FRT80B/TM6b and w; Rh6-lacZ/CyO; 
Dscam2"""?, FRT80B/TM6b. For lamina neuron-specific MARCM the stocks 
were w; Dac-FLP/CyO Kr-GFP; Dscam2""", FRT79/TM6b, and w; actin-Gal4, 
UAS-CD8GFP; Gal80, FRT79/TM6b (gift from A. Nose). The stocks used for
reverse MARCM were Dac-FLP; Dac-FLP; FRT79/CyO:TM6b and w; actin-Gal4,
UAS-CD8GFP, Gal80, Dscam2"""', FRT79/TM6b.

Homologous recombination. Ends-out homologous recombination was per- 
formed essentially as described”. In brief, an ‘ends-out’ targeting construct, 
pW37 Dscam2, was generated that contained the white gene, immediately 
flanked upstream and downstream by insulator sequences from pPelican vector, 
followed by 3.5-kb and 3.1-kb homologous arms lying upstream and downstream
from exon 1 of Dscam2, respectively. Four independent donor lines harbouring
the targeting transgene were crossed to the HR stock described above.
Progeny from this cross were heat-shocked for 1 h at 38 °C at 0-48 h of develop- 
ment. About 600 mosaic females from each donor line were crossed to the tester 
stock (above) and non-mosaic progeny were then backcrossed to the tester stock. 
Stocks were established from flies that lacked eye colour mosaicism and were 
analysed by PCR. Three of the 40 lines established from ‘red-eyed’ flies contained 
targeted insertions. 

Molecular verification of Dscam2 targeting. DNA was extracted from homo- 
zygous viable candidate lines, and genomic PCR was performed. Primers anneal- 
ing outside the Dscam2 locus were used in combination with primers annealing 
within the deleted region in the same reaction (Fig. 1c). The lack of band from the 
deleted exon 1 region indicated the presence of the targeted allele.

Construction of a lamina neuron-specific FLP source. A 325-bp subfragment
of 3EE*”” (ref. 24) was used to build a lamina-specific FLP transgene. A NotI–BamHI
fragment containing the entire FLP coding region and a simian virus 40
(SV40)-poly(A) tail from the UAS-FLP vector (gift from J. Duffy) was cloned
into the NotI–BamHI-digested pCasper-4. The 325-bp enhancer fragment was
then added as an EcoRI fragment upstream of the FLP-SV40-PolyA sequence.
Finally, an hsp70 minimal promoter was inserted as a KpnI–NotI adaptor fragment
between the lamina enhancer and the FLP coding region to generate the
dac-lamina-FLP vector. This vector was injected, in accordance with standard 
protocols, to generate several independent transgenic fly lines. 

Histology. Immunohistochemistry was performed as described*’. The rabbit 
polyclonal antibody raised against the Dscam2 cytoplasmic domain was used 
at a 1:2,000 dilution for immunohistochemistry. 

Cell aggregation. To generate plasmids containing both a Dscam cDNA and a 
fluorescent marker, pIZGM was created by removing the OpIE2 promoter from 
pIZT (Invitrogen) and replacing it with the metallothionein inducible promoter,
MtnA, from pRMHA3. pIZRM was created by replacing GFP in pIZGM with a 
PCR product containing RFP. Dscam and Dscam3 were excised from 
pBluescript, filled in to create blunt ends, and cloned into pIZGM or pIZRM. 
Dscam2 was excised from pOTB7, filled in to create blunt ends, and cloned into 
pIZGM or pIZRM. Aggregation assays were performed as described”. 
Pull-down assays. Dscam2A, Dscam2B, and Dscam3 full-length were modified 
with two tandemly arrayed Flag or haemagglutinin tags. These were introduced 
into each construct by cloning annealed oligonucleotides containing the epitope 
in frame with the cytoplasmic domains of each Dscam family member. 
Dscam2A-Fc and Dscam2B—Fc were generated as described previously for 
Dscam-Fc proteins'*. These fusion proteins comprised the N-terminal nine 
immunoglobulin domains and a single FNIII repeat from Dscam2 followed by 
human FcH. Pull-down assays were performed as described"’.

Deficiencies in DNA damage repair limit the function
of haematopoietic stem cells with age 


Derrick J. Rossi'*, David Bryder'*t, Jun Seita', Andre Nussenzweig’, Jan Hoeijmakers® & Irving L. Weissman’ 


A diminished capacity to maintain tissue homeostasis is a central 
physiological characteristic of ageing. As stem cells regulate tissue 
homeostasis, depletion of stem cell reserves and/or diminished 
stem cell function have been postulated to contribute to ageing’. 
It has further been suggested that accumulated DNA damage could 
be a principal mechanism underlying age-dependent stem cell 
decline”. We have tested these hypotheses by examining haemato- 
poietic stem cell reserves and function with age in mice deficient 
in several genomic maintenance pathways including nucleotide 
excision repair®*, telomere maintenance*® and non-homologous 
end-joining”*. Here we show that although deficiencies in these 
pathways did not deplete stem cell reserves with age, stem cell 
functional capacity was severely affected under conditions of 
stress, leading to loss of reconstitution and proliferative potential, 
diminished self-renewal, increased apoptosis and, ultimately, 
functional exhaustion. Moreover, we provide evidence that endo- 
genous DNA damage accumulates with age in wild-type stem cells. 
These data are consistent with DNA damage accrual being a 
physiological mechanism of stem cell ageing that may contribute 
to the diminished capacity of aged tissues to return to homeostasis 
after exposure to acute stress or injury. 

In the murine haematopoietic system, long-term multilineage dif- 
ferentiation and self-renewal are mediated by long-term reconstit- 
uting haematopoietic stem cells (LT-HSCs), which can be isolated 
from the bone marrow of young and old mice by their unique cell 
surface phenotype (lineage− c-Kit+ Sca-1+ flk2− CD34−)*"° (Supplementary
Fig. 1). To evaluate the effect of deficiencies in nucleotide
excision repair (NER), non-homologous end-joining (NHEJ) and 
telomere maintenance on stem cell reserves during ageing, we quan- 
tified the frequency and absolute numbers of LT-HSCs in the bone 
marrow of young and old XPD^TTD (refs 3, 4), Ku80−/− (refs 7, 8), and
late-generation mTR−/− (refs 5, 6) mice and controls. These analyses
revealed that, regardless of age, neither stem cell frequency (Fig. 1a–d)
nor absolute numbers (Supplementary Fig. 2) were appreciably 
reduced in any of the mutants examined. Indeed, rather than being 
diminished, the frequency of LT-HSCs in the bone marrow of the 
mutants increased significantly with age (Fig. 1e–g), which is con-
sistent with the expansion of LT-HSC reserves in BL6 strains of mice
ageing naturally (Fig. 1h)’"''. Moreover, the degree to which the stem 
cell pool expanded in each of the mutant strains was closely corre- 
lated with age-matched controls (Supplementary Fig. 3). We next 
evaluated the impact of genomic maintenance and ageing on down- 
stream multipotent progenitor (MPP) and oligopotent progenitor 
populations (Supplementary Fig. 1)'*. These analyses revealed that 
whereas short-term (ST)-HSC reserves were not significantly affected 
in any of the mutants assayed (Supplementary Fig. 4), downstream
MPPflk2+, common myeloid progenitor (CMP) and common
lymphoid progenitor (CLP) populations were frequently
diminished in the mutants, although this was not strictly correlated
with age (Fig. 1i–k). Taken together, these results indicate that defi-
ciencies in NER, telomere maintenance or NHEJ do not significantly
affect the establishment, maintenance or expansion of LT-HSC 
reserves with age. This suggests that LT-HSCs may be cytoprotected 
against the accumulation of different types of DNA lesion with age- 
ing, perhaps as a consequence of their largely quiescent state’. In 
contrast, downstream progenitors, which cycle more rapidly”, were 
more adversely affected in these mutants, indicating that these popu- 
lations might be more susceptible to DNA damage responses such as 
growth arrest or apoptosis, which are characteristically activated in 
cycling cells at the G1/S and G2/M checkpoints’. 

Although we and others have assayed HSC activity in young telo- 
merase-deficient mice'®'’, the consequence of advancing age and 
accumulated damage resulting from telomere attrition on LT-HSC 
function has not been evaluated. We therefore purified LT-HSCs 
from old (60-week) late-generation (G3) mTR−/− mutants and con-
trols, and competitively transplanted 50 stem cells against 2 x 10° 
competitor bone marrow cells with the use of the CD45 congenic 
system’. We reasoned that this strategy would maximize the genomic 
damage associated with critically short telomeres in LT-HSCs as 
increased genomic instability®, and signal-free telomere ends'® have 
been shown to accompany the ageing of haematopoietic cells in late-
generation mTR−/− mice (Supplementary Fig. 5). Analysis of trans-
plant recipients revealed that short-term reconstitution was modestly
reduced in the G3 mTR−/− LT-HSC-transplanted recipients, yet by
20 weeks after transplantation it had dropped off precipitously 
(Fig. 2a), with B-cell, T-cell and myeloid lineages all being signifi- 
cantly affected (Fig. 2b). Granulocyte chimaerism was monitored 
throughout the course of the experiment as a measure of ongoing 
stem cell function because granulocytes are short-lived and require 
continued stem cell activity to be generated'’. These analyses indi- 
cated a progressive loss of stem cell function that approached exhaus- 
tion by 20 weeks after transplantation (Fig. 2c). Consistent with this, 
stem cells from primary G3 mTR−/− transplanted recipients were
incapable of serially transplanting secondary recipients, indicating 
that they had become functionally exhausted (Fig. 2d). Transplanta- 
tion experiments performed in parallel with LT-HSCs from younger 
(36-week) G3 mTR−/− mice revealed that although stem cells from
these donors were compromised in comparison with controls (Fig.
2e), the magnitude of the functional decline was not as marked as
when stem cells from older G3 mTR−/− mice were assayed (Fig. 2c).

To test whether diminished self-renewal might underlie the func-
tional exhaustion of aged G3 mTR−/− LT-HSCs, we assayed the
capacity of these cells to self-renew in primary transplant recipients’.
These experiments showed that LT-HSCs from aged G3 mTR−/−
donors were 16-fold less capable than controls of giving rise to phe- 
nocopies of themselves (Fig. 2f). We next assayed the intrinsic pro- 
liferative capacity of telomerase-deficient stem cells by quantifying 
the total progeny-cell output of cultured KLSflk2− cells (LT-HSCs
and ST-HSCs combined) from young (19-week) and old (51-week)
G3 mutants and controls. This showed that there was a significant
decline in the proliferative capacity of the mTR−/− cells, which was
exacerbated with advanced age (Fig. 2g) and underwritten by an 
increased apoptotic response (Fig. 2h). Cumulatively, these results 
indicate that telomere attrition limits stem cell function in an age- 
dependent manner by intrinsically diminishing self-renewal and pro- 
liferative capacity, and rendering LT-HSCs susceptible to apoptosis 
under conditions of stress. 

To assay the effect of NER on LT-HSC functional capacity, we 
competitively transplanted 50 LT-HSCs from 26-week-old XPD^TTD
and XPD+/+ mice. Stem cells from the XPD^TTD mice showed a sig-
nificant diminution in multilineage reconstitution potential (Fig. 3a,
b) and a progressive loss of stem activity that approached exhaustion 
by 16 weeks after transplantation (Fig. 3c). The observation that 
XPD^TTD LT-HSCs were incapable of stably reconstituting secondary
hosts during serial transplantation confirmed that stem cell activity 
had become exhausted (Fig. 3d). Transplantation experiments per-
formed in parallel with stem cells from younger (12-week) XPD^TTD
and control mice revealed that although stem cells from younger
XPD^TTD mice were functionally compromised in comparison with
controls (Fig. 3e), they performed significantly better than stem cells
from older XPD^TTD mice (Fig. 3c).

To determine whether diminished self-renewal capacity contrib-
uted to the functional decline of XPD^TTD stem cells, we assayed the
capacity of LT-HSCs to self-renew in primary transplant recipients,
which showed that XPD^TTD LT-HSCs had a 5.2-fold lower capacity
for self-renewal than controls (Fig. 3f). We next tested the intrinsic
proliferative capacity of KLSflk2− cells from young (16-week) and
old (73-week) XPD^TTD mutants and controls and found that whereas
the cells from young mutants were marginally affected, stem cells
from old XPD^TTD mice showed significantly reduced proliferative
capacity (Fig. 3g), which was associated with increased apoptosis 
(Fig. 3h). Taken together, these results identify a significant role 
for xeroderma pigmentosum complementation group D (XPD)- 
mediated NER in maintaining the functional capacity of LT-HSCs 
with age by preserving reconstitution ability, self-renewal potential
and proliferative capacity, and by preventing programmed cell death 
under conditions of stress. 

We next assayed the importance of NHEJ on stem cell function by 
competitive transplantation of LT-HSCs from Ku80-deficient mice 
and controls. As expected, Ku80−/− LT-HSCs were unable to gen-
erate mature B and T cells as a result of an inability to undergo V(D)J
recombination’. Ku80-deficient stem cells were also sharply impaired 
in their ability to reconstitute myeloid lineages, indicating severely 
diminished stem cell activity (Fig. 4a, b). Consistent with this was our 
observation that Ku80−/− LT-HSCs were 26-fold less capable of giv-
ing rise to phenocopies of themselves than controls in primary trans-
plant recipients, indicative of an attenuated self-renewal capacity
(Fig. 4c). Moreover, cultured KLSflk2− cells from Ku80−/− mutants
had a reduced capacity to proliferate, which was greatly exacerbated 
with age (Fig. 4d) and was associated with increased apoptosis 
(Fig. 4e). Cumulatively, these results identify a role for Ku80 and 
NHEJ in maintaining LT-HSC function by conserving reconstitution 
potential, self-renewal capacity, proliferative capacity and stem cell
viability under conditions of stress. 

Our data showing an age-dependent diminution of stem cell func- 
tion in three different genomic maintenance-deficient settings sug- 
gested that accumulated genomic damage might be an important 
physiological mechanism contributing to stem cell decline with age. 
To test whether DNA damage accumulation accompanied normal 
stem cell ageing, we immunostained LT-HSCs from young (10-week) 
and old (122-week) mice for phosphorylation of histone H2AX
(γ-H2AX) as an indicator of DNA damage”’, and quantified the number
of γ-H2AX foci in individual stem cells. This analysis revealed that
whereas LT-HSCs from young mice were largely devoid of γ-H2AX
foci, the vast majority (82%) of the stem cells from old mice stained
positively for γ-H2AX, with more than 70% of the cells showing mul-
tiple foci (Fig. 4g, h). Similarly, ST-HSCs and MPPflk2+ isolated from
old mice contained significantly more γ-H2AX foci than their young
counterparts (Fig. 4i, j), although the percentage of γ-H2AX-positive
old cells decreased as the cells progressed from LT-HSCs through the
more committed progenitors (65% in old ST-HSCs; 25% in old
MPPflk2+). By the CLPflk2+ and CMP stages of differentiation, signifi-
cant differences in γ-H2AX staining between young and old mice were
no longer observed (data not shown). Taken together, these data
indicate that DNA damage accumulates in stem cells with age, and
suggest that proliferating progenitor cells are either repaired more
readily or are eliminated on accumulating damage. 

Although studies with purified HSCs have provided great detail 
about how ageing alters the functional capacity of HSCs”"''*!, much 
less is known about the mechanisms driving these changes. Because 
HSCs are long-lived, age-dependent functional decline could be pos- 
tulated to result from the accumulation of macromolecular damage in 
general”, or DNA damage in particular’. In support of this, stem cells 
from mice with mutations in Brca2 (ref. 23) or Msh2 (ref. 24) have 
reduced repopulating abilities, whereas Ercc1-deficient mice have mul- 
tilineage cytopenias indicating possible stem or progenitor cell dysfunc- 
tion’’. Evidence that DNA damage response has a significant bearing on 
the function of HSCs during ageing was provided in studies dem- 
onstrating that reactive oxygen species limit the functional capacity 
of HSCs from ataxia-telangiectasia mutated (ATM)-deficient mice” 
in a p38-MAPK-dependent manner”, and in studies on mice bearing 
a mutated Rad50 allele, which undergo haematopoietic failure in an 
ATM-Chk2-dependent fashion**”. The present demonstration that 
genetic deficiencies in telomere maintenance, NER and NHEJ intrins- 
ically diminish LT-HSC function in an age-dependent manner under 
conditions of stress indicates that DNA damage accrual may underlie 
the reduced capacity of stem cells to mediate a return to homeostasis 
after exposure to injury or stress. Our findings also have implications
for stem cell involvement in oncogenesis because they establish that 
relatively quiescent stem cells can persist in the face of age-dependent 
DNA damage accrual, and in such a way might serve as a reservoir for 
the multiple mutagenic events underlying oncogenic transformation. 


METHODS SUMMARY 

Purification and transplantation of cells. LT-HSCs (lineage− c-Kit+ Sca1+
flk2− CD34−) were purified and transplanted as described’. In brief, bone mar-
row cells were enriched for c-Kit, stained with fluorescence-conjugated antibod-
ies against Sca1, c-Kit, CD34, flk2 and lineage (CD3, CD4, CD8, Mac-1, B220,
Gr-1 and Ter119) and purified by fluorescence-activated cell sorting (FACS).
Fifty test cells (CD45.2) were transplanted against 2 X 10° bone marrow com-
petitor cells (CD45.1) into lethally irradiated recipients (CD45.1). Peripheral
blood was analysed with simultaneous detection of CD45.1, CD45.2, T-cell
antigen receptor (TCR)-β, B220, Mac1 and Ter119.

Proliferation and annexin V analysis. Equivalent cell numbers were sorted and 
cultured for 3.5-4.5 days in RPMI medium containing 10% fetal calf serum, 
10 ng ml−1 stem cell factor, thrombopoietin, interleukin (IL)-3, IL-6, IL-11 and
Flt3 ligand at 37 °C, 2.5% O2 and 5% CO2. Cells were then stained with annexin
V and propidium iodide, and analysed by FACS. 

γ-H2AX immunostaining. γ-H2AX was revealed by using the SCIPhos
(single-cell imaging of phosphorylation) assay”. In brief, cells were sorted into
droplets of PBS on poly(L-lysine)-coated slides, then fixed, permeabilized and
stained with phospho-specific (Ser 139) histone H2AX antibody (Biolegend).
After being washed, the cells were stained with a secondary Alexa Fluor 488-
conjugated antibody and 4′,6-diamidino-2-phenylindole. Quantification of
γ-H2AX foci was performed by fluorescence microscopy and analysed statist-
ically with the Mann-Whitney U-test.
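
As an illustration of the statistical comparison named above, the following is a minimal sketch of how the Mann-Whitney U statistic can be computed for foci counts from two groups of cells. It is not the authors' analysis code; the function name and the example counts are hypothetical, and only the U statistic (not the P value) is calculated.

```python
from itertools import chain


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two samples, using midranks for tied values."""
    pooled = sorted(chain(sample_a, sample_b))

    # Assign each distinct value the mean of the ranks it occupies (midrank).
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Values at positions i..j-1 occupy ranks i+1..j (1-indexed).
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j

    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[value] for value in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical foci counts per cell (illustrative values only).
young_foci = [0, 0, 0, 1, 0]
old_foci = [2, 3, 1, 4, 2]
u_young, u_old = mann_whitney_u(young_foci, old_foci)
```

A small U for one group indicates that its values rank consistently below those of the other group; a significance test would additionally require the null distribution of U or a normal approximation.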


Figure 3 | NER deficiency limits LT-HSC function 
Figure 4 | NHEJ deficiency and endogenous damage accumulation in stem
cells with age. a–c, Competitive transplantation of LT-HSCs from 16-week
Ku80+/+ (black) or Ku80−/− (grey) donors (n = 9 and n = 7 recipients,
respectively) showing myeloid (Mac1+) reconstitution (a) and granulocyte
chimaerism (b). c, Donor LT-HSC frequency from the recipients described
in a. d, Proliferative potential of Ku80+/+ (black bars) or Ku80−/− (grey
bars) KLSflk2− cells. e, Annexin V-positive cells in the experiment described
in d. Significant differences (Student’s t-test) are indicated as follows:
asterisk, P < 0.05; two asterisks, P < 0.005. Error bars denote s.e.m.
f, Immunostaining of γ-H2AX in c-Kit+ Sca1+ lin− cells showing irradiation
Full Methods and any associated references are available in the online version of
METHODS

Mice. Strains of mice used included XPD^TTD mice, which model the human
segmental progeroid syndrome trichothiodystrophy (TTD) and have a mutation
of the XPD helicase with pleiotropic functional deficits including a partial defect
in NER*!2, KU80−/− mice, which are defective in NHEJ and in double-strand
break repair as a result of ablation of KU80 (refs 33–35), and mTR−/− mice,
which are defective in telomere maintenance as a result of the targeted disruption
of the telomerase RNA component***’. Because mice have long telomeres,
the telomerase mTR−/− mutants were backcrossed for several generations to
allow telomeres to shorten enough to become debilitating (late generation).
When initially reported, mTR−/− mice on a mixed background needed to be
backcrossed for four to six generations for telomere dysfunction to be mani-
fested*°*’**. However, because mTR is haploinsufficient for telomere mainten-
ance*’, and the BL6 strain to which we backcrossed the mutation has shorter
telomeres than other inbred strains*°, the mTR−/− mice could only be interbred
through two (G2) to three generations (G3) before becoming sterile and showing
overt signs of telomere dysfunction such as cachexia and reduced life span
(Supplementary Fig. 5) due to critically short telomeres and increased chro- 
mosome instability*'. In all experiments, age-matched wild-type littermate con- 
trols were used when possible. For the late-generation telomerase mutants whose 
breeding scheme did not produce wild-type littermate controls, age-matched 
controls of the same genetic background were used. All mice were on a C57BL/6 
background and were maintained at the Stanford University Laboratory Animal 
Facility. 

Absolute numbers of LT-HSCs. Absolute numbers were calculated from the 
bone marrow cellularity of the four hindlimb bones, the bone marrow frequency 
of LT-HSCs and the weight of each mouse, to control for differences in animal 
size. 
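
The calculation described above can be sketched as follows. The exact formula is not given in the text, so this is only one plausible reading: the absolute number is taken as hindlimb cellularity multiplied by LT-HSC frequency, and the weight adjustment is assumed to be a simple per-gram normalization. The function name, parameter names and example values are hypothetical.

```python
def lt_hsc_numbers(hindlimb_cellularity, lt_hsc_frequency, body_weight_g):
    """Estimate LT-HSC numbers from bone marrow cellularity and frequency.

    hindlimb_cellularity: total nucleated cells from the four hindlimb bones
    lt_hsc_frequency: fraction of bone marrow cells that are LT-HSCs (0 to 1)
    body_weight_g: mouse weight in grams

    Returns (absolute_number, number_per_gram). The per-gram normalization is
    an assumption; the source states only that weight was used to control for
    differences in animal size.
    """
    if not 0.0 <= lt_hsc_frequency <= 1.0:
        raise ValueError("frequency must be a fraction between 0 and 1")
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    absolute_number = hindlimb_cellularity * lt_hsc_frequency
    return absolute_number, absolute_number / body_weight_g


# Hypothetical example: 4e7 cells, 0.01% LT-HSCs, 25 g mouse.
absolute, per_gram = lt_hsc_numbers(4e7, 0.0001, 25.0)
```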

Secondary transplantation. Serial transplantation was performed in several 
ways. Primary recipients were transplanted either competitively with 50 LT-
HSCs as described above, or non-competitively with about 5 X 10° bone marrow
cells obtained from test or control mice. In the latter case, flow cytometry stain- 
ing for LT-HSCs was first performed to ensure that the frequency of LT-HSCs in 
bone marrow from test and control mice was comparable so that we would be 
transplanting stem cell equivalents. Variations in frequency in bone marrow 
were adjusted for before transplantation. For secondary transplants, if primary 
recipients were competitively transplanted we sorted donor-derived (test or 
control) LT-HSCs from the bone marrow and competitively transplanted 25 
or 100 of these cells into the secondary host as described above. If primary 
recipients had been transplanted non-competitively, 10° bone marrow cells were 
transplanted into lethally irradiated secondary recipients. Peripheral blood ana- 
lysis of secondary recipients was performed as described above. 

Self-renewal determination. Self-renewal was determined as described prev- 
iously**. In brief, primary lethally irradiated recipients (CD45.1) were trans- 
planted non-competitively with stem cell equivalents of test or control bone 
marrow. Stem cell equivalents were determined by staining donor bone marrow 
for LT-HSCs to determine frequency before transplantation. At 5-7 months after 
transplantation, primary recipients were sacrificed and the frequency of donor- 
derived (CD45.2) LT-HSCs in bone marrow was determined by using an eight- 
colour flow cytometric protocol with simultaneous detection of CD45.1, 
CD45.2, lineage, c-Kit, Sca1, CD34 and flk2, along with discrimination of dead
cells (propidium iodide). In each experiment three to five recipients transplanted 
with test or control cells were assayed. 
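
Fold differences in self-renewal, such as those reported in the main text, could be derived from such measurements roughly as sketched below. This assumes the fold reduction is the ratio of mean donor-derived LT-HSC frequencies in control versus test recipients, which the text does not state explicitly; the function name and the example frequencies are hypothetical.

```python
from statistics import mean


def fold_reduction(control_frequencies, test_frequencies):
    """Ratio of mean donor-derived LT-HSC frequency, control over test.

    Each argument is a list of per-recipient frequencies (one value per
    transplanted mouse). A result of 16 would mean test stem cells produced
    16-fold fewer phenocopies of themselves than controls did.
    """
    test_mean = mean(test_frequencies)
    if test_mean <= 0:
        raise ValueError("mean test frequency must be positive")
    return mean(control_frequencies) / test_mean


# Hypothetical frequencies from three recipients per group.
example_fold = fold_reduction([0.02, 0.04, 0.03], [0.002, 0.001, 0.003])
```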
The ATM repair pathway inhibits RNA polymerase I
transcription in response to chromosome breaks 


Michael Kruhlak, Elizabeth E. Crouch, Marika Orlov, Carolina Montaño, Stanislaw A. Gorski,
André Nussenzweig, Tom Misteli, Robert D. Phair & Rafael Casellas


DNA lesions interfere with DNA and RNA polymerase activity. 
Cyclobutane pyrimidine dimers and photoproducts generated by 
ultraviolet irradiation cause stalling of RNA polymerase II, acti- 
vation of transcription-coupled repair enzymes, and inhibition of 
RNA synthesis’. During the S phase of the cell cycle, collision of 
replication forks with damaged DNA blocks ongoing DNA rep- 
lication while also triggering a biochemical signal that suppresses 
the firing of distant origins of replication**. Whether the tran- 
scription machinery is affected by the presence of DNA double- 
strand breaks remains a long-standing question. Here we monitor 
RNA polymerase I (Pol I) activity in mouse cells exposed to geno- 
toxic stress and show that induction of DNA breaks leads to a 
transient repression in Pol I transcription. Surprisingly, we find
Pol I inhibition is not itself the direct result of DNA damage but is
mediated by ATM kinase activity and the repair factor proteins 
NBS1 (also known as NLRP2) and MDC1. Using live-cell imaging, 
laser micro-irradiation, and photobleaching technology we dem- 
onstrate that DNA lesions interfere with Pol I initiation complex 
assembly and lead to a premature displacement of elongating 
holoenzymes from ribosomal DNA. Our data reveal a novel 
ATM/NBS1/MDC1-dependent pathway that shuts down ribo-
somal gene transcription in response to chromosome breaks.

To study the effects of DNA double-strand breaks (DSBs) on RNA 
synthesis we monitored transcription of ribosomal genes in cells 
exposed to genotoxic stress. The large copy number and tandem array 
distribution of ribosomal transcription units provide an ideal system 
to measure the kinetics of transcription in real time’. We exposed 
mouse embryonic fibroblasts (MEFs) to increasing doses of ionizing 
radiation and ongoing ribosomal RNA synthesis was assessed by 
fluorouridine (FUrd) incorporation in in situ run-on assays (Supple- 
mentary Fig. 1). In non-irradiated MEFs, we observed high FUrd 
incorporation at nucleolar sites (Fig. 1a). In contrast, we found that 
exposure to γ-irradiation led to a pronounced decrease in nuclear
FUrd incorporation in a dose-dependent manner. Fibroblasts exposed 
to 2.5 Gy of irradiation showed a 30% reduction, whereas exposure to 
10 Gy of irradiation reduced FUrd by nearly 60% (Fig. 1a), indicating 
that induction of DNA DSBs results in Pol I transcriptional arrest.
Consistent with these results, irradiated cells showed segregation of 
upstream binding factor 1 (UBF1; also known as UBTF) into fibrillar 
caps, a characteristic feature of transcriptionally inactive cells® (Sup- 
plementary Fig. 2). To investigate the dynamics of Pol I inhibition, we 
exposed primary MEFs to γ-irradiation or 20 µM etoposide and
assessed transcription at twenty-minute intervals following treatment. 
Both irradiation and etoposide treatment led to a transient inhibition 
of rRNA synthesis. Following a ~60% decrease at 20 min, FUrd incor-
poration 60 min post-DNA damage was indistinguishable from that
of untreated cells (Fig. 1b and Supplementary Fig. 3), indicating Pol I
transcription is restored within approximately an hour of genotoxic
stress. We conclude that DNA breaks elicit a transient block in Pol I
rRNA synthesis.
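
The percentages quoted above are expressed relative to untreated cells. A minimal sketch of that normalization is given below, assuming per-cell anti-FUrd fluorescence intensities are available for each condition and that the mean of the untreated population defines 100%. The function names and intensity values are hypothetical.

```python
from statistics import mean


def percent_of_control(treated_signal, control_signal):
    """Mean treated fluorescence as a percentage of the mean control value."""
    control_mean = mean(control_signal)
    if control_mean <= 0:
        raise ValueError("mean control signal must be positive")
    return 100.0 * mean(treated_signal) / control_mean


def percent_reduction(treated_signal, control_signal):
    """Reduction in FUrd incorporation relative to untreated cells."""
    return 100.0 - percent_of_control(treated_signal, control_signal)


# Hypothetical per-cell intensities (arbitrary units).
untreated = [200, 220, 180]
irradiated = [80, 90, 70]
reduction = percent_reduction(irradiated, untreated)
```

With these illustrative numbers the irradiated cells retain 40% of the control signal, which would be reported as a 60% reduction.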
Figure 1 | DNA DSBs inhibit Pol I transcription. a, Primary MEFs were
untreated (left) or exposed to 10 Gy of irradiation (right) and ongoing rRNA
synthesis was monitored by FUrd run-on assays. Lower graph represents
percentage of FUrd incorporation in fibroblasts exposed to different doses of
irradiation. The mean (represented by cross lines) of anti-FUrd antibody
fluorescence in non-irradiated (0 Gy) cells was set to 100%; scale bar, 15 µm.
b, FUrd incorporation in wild-type MEFs assessed at 20, 40 and 60 min
post irradiation (5 Gy). c, MEF nucleoli were exposed to localized laser micro-
irradiation (white circles; differential interference contrast image). γH2AX
(green) and FUrd incorporation (red) were assessed by immunocytochemistry
(scale bar, 2 µm). Bar graph quantifies FUrd incorporation in undamaged and
damaged nucleoli. Values represent mean ± s.d. (n = 20).
Induction of DNA damage in the S phase of the cell cycle inhibits
DNA polymerases at a distance from damaged sites’, whereas tran- 
scription-coupled repair blocks RNA polymerase activity only at 
ultraviolet radiation (UV)-damaged genes’. To investigate whether 
Pol I transcription is globally or locally inhibited by DNA damage, we
introduced localized DNA DSBs using laser micro-irradiation in 
Hoechst-sensitized cells*. To ensure a moderate DSB density, micro- 
irradiation was calibrated to introduce approximately a single lesion 
per megabase (Mb) of DNA®, or one DSB for every 23 ribosomal 
transcription units. We found that although Pol I transcription was
blocked in micro-irradiated nucleoli, transcriptional activity contin- 
ued undisrupted in neighbouring nucleoli (Fig. 1c). We conclude 
that Pol I transcription is only inhibited in proximity to DNA DSBs.
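
The equivalence stated above between one lesion per megabase and one DSB per 23 ribosomal transcription units can be checked with a short calculation. The sketch below derives the repeat length that the two figures jointly imply and, under the added assumption that breaks are distributed randomly (Poisson) along the DNA, the chance that any single unit carries a break. The function names are hypothetical and the Poisson step is not taken from the text.

```python
import math

BP_PER_MB = 1_000_000


def implied_unit_length_bp(dsb_per_mb, units_per_dsb):
    """Length of one rDNA unit implied by the stated lesion densities."""
    return BP_PER_MB / (dsb_per_mb * units_per_dsb)


def probability_unit_hit(unit_length_bp, dsb_per_mb):
    """Probability that a unit contains at least one DSB.

    Assumes breaks follow a Poisson distribution along the DNA.
    """
    expected_breaks = dsb_per_mb * unit_length_bp / BP_PER_MB
    return 1.0 - math.exp(-expected_breaks)


unit_length = implied_unit_length_bp(dsb_per_mb=1, units_per_dsb=23)
hit_probability = probability_unit_hit(unit_length, dsb_per_mb=1)
```

The implied unit length is roughly 43.5 kb, and under the Poisson assumption only about 4% of units in the irradiated area would be expected to carry a break directly.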

Previous studies have shown that the DNA-dependent protein kinase 
(DNA-PK; also known as PRKDC) and Ku (also known as XRCC5) 
protein heterodimer interferes with Pol I activity near DSBs in vitro’. To 
investigate whether DNA-PK inhibits rRNA synthesis in vivo we irra-
diated Ku80−/− MEFs and assessed Pol I transcription by FUrd run-on
assays. As in wild-type cells, FUrd incorporation was abolished in irra-
diated Ku80−/− fibroblasts (Fig. 2a), demonstrating that Ku is dispens-
able for nucleolar transcriptional arrest. Likewise, the JNK2 (also
known as MAPK9) signalling pathway, shown to inhibit rRNA syn- 
thesis in response to cellular stress’, was dispensable (Supplementary 
Fig. 4). In addition, proteasome activity, which degrades stalled
polymerases during transcription-coupled repair'', was not required 
to shut down rRNA synthesis following induction of DNA breaks 
(Supplementary Fig. 4). To investigate whether other DNA repair path-
ways were involved in blocking Pol I activity we screened several DNA-
repair-deficient MEFs. Surprisingly, we found that Atm−/− fibroblasts
are unable to block Pol I post γ-irradiation (Fig. 2a). Similarly, Atm-null
cells cultured in the presence of etoposide or exposed to localized
micro-irradiation failed to downregulate Pol I transcription (Supp-
lementary Fig. 5), and no changes were seen in UBF1 localization in
irradiation-treated Atm−/− MEFs (Supplementary Fig. 5). Altogether,
these data argue that Pol I transcription is not blocked by DNA damage 
itself, but by the action of DNA repair enzymes. 

ATM kinase activity is essential for cellular signalling in response 
to DNA breaks'”"*. We found that pre-treatment of wild-type MEFs 
with the specific ATM kinase inhibitor KU55933 leads to radioresis- 
tant rRNA synthesis (Fig. 2b), indicating ATM phosphorylation is 
required for Pol I inhibition. To assess directly the role of DNA repair 
substrates involved in the ATM pathway, we screened MEFs that are 
deficient for various repair factors. We found that Pol I transcription 
is efficiently blocked in irradiated cells lacking 53BP1 (also known as 
TRP53BP1), BRCA1 or histone H2AX (also known as H2AFX) (Fig. 
2c). Compared to wild-type fibroblasts, however, resumption of rRNA 
synthesis was significantly delayed in DNA-repair-compromised cells 
(Fig. 2c), suggesting that restoration of transcription in damaged nuc- 
leoli is dependent on successful repair of DNA lesions. In support of 
this idea, resumption of rRNA synthesis in irradiated wild-type cells 
(Fig. 1b) correlates with the disappearance of phosphorylated H2AX 
(yH2AX) foci’* (Supplementary Fig. 6). 

Analogous to Atm−/−, Mdc1-null fibroblasts and to a lesser extent
MEFs expressing a hypomorphic mutation in Nbs1 (Nbs1^657Δ5; ref.
15) showed radioresistant rRNA synthesis (Fig. 2c). The incomplete
penetrance of the Nbs1 allele in this assay is probably due to the
residual capacity of NBS1^657Δ5 to activate ATM and trigger a sub-
optimal DNA damage response’. These results demonstrate that
ATM kinase activity, MDC1 and NBS1 are required for blocking
Pol I transcription in response to DNA breaks, whereas Ku, BRCA1,
53BP1 and histone H2AX are dispensable. 

Using Pol I-green fluorescent protein (GFP) fusion proteins and 
photobleaching techniques’ (Supplementary Fig. 7) we next investi-
gated the dynamics of Pol I nucleolar entry, initiation complex
assembly at the promoter and holoenzyme elongation in the presence
or absence of micro-irradiation. We first monitored rRNA transcrip-
tion with GFP-labelled RPA194, one of two Pol I catalytic subunits
that directly interacts with DNA‘. In agreement with published
observations’, RPA194–GFP recovery in undamaged nucleoli was
biphasic with a rapid recovery phase followed by a slow component
~50 s after photobleaching (Fig. 3b, control curve). The fast recovery
phase represents predominantly Pol I subunits and auxiliary factors
that rapidly and continuously exchange between the nucleoplasm
and the nucleolus’, whereas the slower phase largely represents assem-
bly of initiation complexes at the promoter, and elongating holo-
enzymes” (Fig. 3a). In marked contrast, the initial RPA194–GFP
recovery phase in micro-irradiated nucleoli reached its maximum
~40 s post-bleaching and was followed by a time-dependent decline
in fluorescence (Fig. 3b). This declining phase became less pronounced
as the time between DNA damage and photobleaching was increased
and eventually it was transformed to a near plateau when photobleach-
ing was performed 300 s post damage (Fig. 3c). Analogous kinetics
were observed for the transcriptional initiation factor-IA (TIF-IA; also
known as RRN3) and the polymerase-associated factor 53 (PAF53;
also known as POLR1E) (Supplementary Fig. 8). These changes in
Figure 2 | Atm−/−, Nbs1^657Δ5 and Mdc1−/− fibroblasts show radioresistant
rRNA synthesis. a, Ku80−/− and Atm−/− primary MEFs were exposed to
10 Gy of irradiation and ongoing rRNA transcription was monitored by FUrd
incorporation (scale bar, 15 µm). b, Atm+/+ MEFs were untreated (0 Gy) or
irradiated (5 Gy) in the presence or absence of the ATM kinase inhibitor
KU55933, and Pol I activity was assessed by FUrd nuclear run-ons. c, Upper
panel, wild type or MEFs deficient in MDC1, H2AX, BRCA1, Ku80, 53BP1 or
expressing the human NBS1 mutant protein (Nbs1^657Δ5) were irradiated with
5 Gy, and FUrd was assessed over time. Values represent the mean ± s.d.
(n = 200; *P < 0.0001; versus 53BP1); for comparative purposes the mean
value of each cell line at time 0 was set to 100%. Lower panel, H2ax−/−,
Mdc1−/− and Nbs1^657Δ5 were untreated or irradiated (5 Gy) and Pol I activity
was determined by FUrd incorporation 30 min post DNA damage.
Pol I kinetics were not influenced by the presence of Hoechst
dye because similar results were obtained in nucleoli damaged by
multi-photon micro-irradiation’* in the absence of a sensitizing dye
(Supplementary Fig. 9). In addition, experiments in Atm-null cells
showed little change in RPA194–GFP recovery in damaged or undam-
aged nucleoli (Fig. 3e). In the presence of ATM, 300 s recovery curves
were in all cases analogous to those observed in transcriptionally inact-
ive mitotic cells (Fig. 3d, metaphase curve), demonstrating that Pol I
transcriptional arrest is complete within 5 min of DNA DSB induction.
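
The biphasic recovery described above for undamaged nucleoli can be illustrated with a simple two-component curve, in which a fast pool (freely exchanging subunits) and a slow pool (promoter-bound and elongating complexes) each recover exponentially. This is only a descriptive sketch, not the kinetic model used in the paper, and it does not reproduce the fluorescence decline seen in damaged nucleoli. The function name and all parameter values are hypothetical.

```python
import math


def biphasic_recovery(t_seconds, fast_fraction, k_fast, k_slow):
    """Normalized fluorescence recovery at time t after photobleaching.

    fast_fraction: share of the signal in the rapidly exchanging pool (0 to 1)
    k_fast, k_slow: first-order rate constants (per second) for each pool
    Returns a value between 0 (fully bleached) and 1 (fully recovered).
    """
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must be between 0 and 1")
    slow_fraction = 1.0 - fast_fraction
    fast_term = fast_fraction * (1.0 - math.exp(-k_fast * t_seconds))
    slow_term = slow_fraction * (1.0 - math.exp(-k_slow * t_seconds))
    return fast_term + slow_term


# Hypothetical parameters chosen only to give a visibly biphasic curve.
time_points = [0, 10, 50, 150, 300]
curve = [biphasic_recovery(t, fast_fraction=0.6, k_fast=0.2, k_slow=0.01)
         for t in time_points]
```

With a large separation between the two rate constants, the curve rises steeply at first and then continues to climb slowly, which is the qualitative shape reported for control nucleoli.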

To elucidate the mechanistic details of rDNA transcriptional arrest 
we analysed our fluorescence recovery after photobleaching (FRAP) 
data on the basis of a previous model of Pol I kinetics’. Using standard 
principles of physical chemistry, four possible mechanistic theories 
were tested: (1) inhibition of Pol I entry to damaged nucleoli; (2) inhibi- 
tion of initiation complex assembly at rDNA promoters; (3) block of 
Pol I elongation; and (4) premature displacement of elongating holoen- 
zymes from rDNA. We found that each theory was able to explain some 
features of our Pol I experimental results, but no single mechanism
could simultaneously fit all FRAP curves (Supplementary Fig. 10 and 
Supplementary Information). However, a model that evaluated both 
inhibition of initiation complex assembly and a progressive displace-
ment of elongating holoenzymes from rDNA accounted for all FRAP
data simultaneously (Fig. 4a and Supplementary Information). 

In the assembly/displacement model the time required to halt tran-
scription of 90% of rDNA genes within the damaged area was 93 s,
implying that some ribosomal transcription units are not immediately 
affected by DNA breaks within the damaged area. Consistent with this 
idea, the elongation rate of active holoenzymes before completion of 
transcriptional arrest (before 300 s post micro-irradiation) was ana-
logous to that measured in unirradiated nucleoli (104 nucleotides per
second; Supplementary Information and refs 5, 19). These kinetics 
could be explained if rDNA genes distant from DNA breaks are 
repressed at a later time point than those found in close proximity 
to a lesion. Continual transcription in the first few minutes following 
micro-irradiation would thus allow unbleached Pol I-GFP molecules 
to enter the elongating pool of rDNA genes not yet inhibited by DNA 
damage (Fig. 4b). In this scenario the fluorescence decline observed in 
20 s-post-damage FRAP curves represents the eventual loss of these 
unbleached holoenzymes as they either terminate transcription or are 
prematurely displaced from rDNA. To empirically validate this idea,

Figure 3 | Pol I dynamics at damaged nucleoli. a, Schematic representation
of the major Pol I fractions in transcriptionally active nucleoli. Their
approximate recovery during FRAP analysis is stated: free subunit pool, fast
recovery; pre-initiation complex and elongating pools, slow recovery.
b, c, FRAP analysis of GFP-RPA194 or RPA43-GFP in control and micro-irradiated
nucleoli of CMT3 monkey kidney cells. Photobleaching was
performed either 20 s, 150 s or 300 s post damage. d, FRAP analysis of
undamaged nucleoli from CMT3 cells in prophase or metaphase expressing
RPA43-GFP. e, FRAP and damage-FRAP analysis of Atm−/− MEFs
expressing RPA194-GFP. Values in all panels represent mean ± s.d.
(n = 10-20 cells per condition; *P < 0.002).

Figure 4 | Displacement of Pol I from rDNA in response to DNA breaks.
a, RPA194-GFP FRAP results (Fig. 3b) are accounted for by a kinetic model
postulating simultaneous inhibition of initiation complex assembly and
displacement of elongating holoenzymes from damaged rDNA. b, The
assembly/displacement model proposes that immediately following
micro-irradiation and photobleaching, recruited, unbleached Pol I-GFP
(represented by green ovals) enters the elongating pool of rRNA transcription
units (RTUs) distant from lesion sites. Eventually these holoenzymes are
dislodged from rDNA by an ATM/NBS1/MDC1-dependent mechanism.


we micro-irradiated nucleoli from CMT3 cells expressing RPA194-GFP
and recorded fluorescence loss by time-lapse microscopy (Fig. 4c,
left panel). As predicted by our mathematical modelling, introduction
of DNA DSBs led to a progressive loss of RPA194-GFP, so that at 250 s
post-damage around 40% of RPA194-GFP molecules were excluded
from micro-irradiated nucleoli compared to non-damaged sites
(Fig. 4c, right panel). Importantly, chromatin immunoprecipitation
analysis showed a marked reduction in the interaction of Pol I with
promoter and transcribed regions in irradiated Atm+/+ cells whereas
no changes were found in Atm-null fibroblasts (Fig. 4d).

Our studies indicate that activation of ATM, NBS1 and MDC1 
following DNA damage interferes with Pol I initiation complex assem- 
bly and leads to a progressive displacement of elongating holoenzymes 
from rDNA. A priori, transcriptional arrest around DNA breaks 
might ensure efficient DNA end-processing by preventing transcrip- 
tion across DNA lesions. However, mounting evidence also indicates 
that Pol I regulation has a critical role in monitoring cellular stress”°.
Conditions that inhibit Pol I transcription, such as UV irradiation,
result in nucleolar structural changes and release of ribosomal proteins 
that suppress the MDM2 protein, leading to the accumulation of p53 
(also known as TRP53) and apoptosis”’. In response to DNA DSBs, the 
ATM kinase phosphorylates a multitude of key substrates, including 
p53 (ref. 22). It will be important to determine whether the ATM 
pathway also regulates p53 by inhibition of Pol I rRNA synthesis.


METHODS 


The Methods section describes the following: (1) how DNA DSBs were
introduced in living cells using DNA micro-irradiation as well as γ-irradiation


c, RPA194-GFP fluorescence monitored for 250 s (right panel) in undamaged
and damaged nucleoli (left panel; scale bar, 2 µm). Blue line in graph
represents the loss of Pol I as predicted by the assembly/displacement model;
values represent the mean ± s.d. (n = 15; *P < 0.0001). d, Interaction of
RPA194 with ribosomal gene promoters, ribosomal genes, and intergenic
domains in Atm+/+ or Atm−/− fibroblasts as determined by chromatin
immunoprecipitation and quantitative PCR. Cells were either untreated
(undamaged) or exposed to 10 Gy of γ-radiation (damaged). Values represent
the mean ± s.d. (n = 4).


(a step-by-step description on how DNA DSB density was calculated is also 
provided); (2) confocal microscopy conditions and FRAP analysis; (3) cell culture
conditions; and (4) chromatin immunoprecipitation assays. Pol I kinetic
modelling and statistical analysis are given as Supplementary Information.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS

DSB density at nuclear and nucleolar focused domains. DNA DSBs were
introduced into cell nuclei using a UV laser on a Zeiss LSM 510 META confocal
microscope. Nucleolar or nuclear areas of 15.2 µm² were consistently exposed to
laser micro-irradiation at 0.86 nJ per pixel using 50% laser output. Given a
nuclear volume of 500 µm³ and 6 × 10⁹ base pairs (bp) of DNA in non-replicated
nuclei, we calculated a density of 1.2 × 10⁷ bp of DNA per nuclear µm³. Thus,
approximately 1.7 × 10⁸ bp of DNA were exposed to micro-irradiation in
15.2 µm³. In γ-irradiated cells, one Gy results in approximately 35 DSBs. From
the calculated equivalence of UV laser micro-irradiation to Gy of γ-irradiation,
the DNA at damaged sites was exposed to 5.25 Gy of irradiation, inducing ~185
DSBs. These conditions correspond to a density of 1 DSB for every 940 kb of
DNA, or 23 ribosomal transcription units.
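
The arithmetic in this paragraph can be retraced with a short sketch. The constants below are the rounded figures quoted in the text; the function and variable names are illustrative only and are not part of the original analysis. Because the inputs are rounded, the spacing comes out near 0.9 Mb rather than exactly at the quoted 940 kb.

```python
# Rounded values quoted in the Methods text (names are illustrative).
GENOME_BP = 6e9              # bp of DNA in a non-replicated nucleus
NUCLEAR_VOLUME_UM3 = 500.0   # nuclear volume in cubic micrometres
EXPOSED_BP = 1.7e8           # bp of DNA stated to lie in the irradiated region
DSB_PER_GY = 35              # DSBs induced per Gy of gamma-irradiation
DOSE_EQUIVALENT_GY = 5.25    # stated gamma-equivalent dose at damaged sites


def dna_density(genome_bp, nuclear_volume_um3):
    """Average DNA density in bp per cubic micrometre of nucleus."""
    return genome_bp / nuclear_volume_um3


def dsb_count(dose_gy, dsb_per_gy):
    """Expected number of DSBs for a given gamma-equivalent dose."""
    return dose_gy * dsb_per_gy


def kb_per_dsb(exposed_bp, n_dsb):
    """Average spacing between DSBs, in kilobases."""
    return exposed_bp / n_dsb / 1e3


density = dna_density(GENOME_BP, NUCLEAR_VOLUME_UM3)   # 1.2e7 bp per µm³
n_dsb = dsb_count(DOSE_EQUIVALENT_GY, DSB_PER_GY)      # about 184 DSBs
spacing_kb = kb_per_dsb(EXPOSED_BP, n_dsb)             # roughly 925 kb

print(f"density = {density:.2e} bp/um^3")
print(f"DSBs    = {n_dsb:.0f}")
print(f"spacing = {spacing_kb:.0f} kb per DSB")
```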

To introduce DNA breaks by laser micro-irradiation using near-infrared
multi-photon microscopy, we used a Zeiss LSM510 NLO microscope (Carl
Zeiss) equipped with a 63× (numerical aperture 1.4) Plan-Apochromat objective
lens to focus 800 nm laser light from a Ti:sapphire femtosecond pulsed laser
(Coherent) mode-locked at wavelength 800 nm (116 MHz, 12 mW output at the
sample). Laser exposure was restricted to 17.6 µm² regions with a total exposure
time of 60 ms. Subsequent FRAP time series were performed as mentioned
above, except a consistent 6.2 µm² circular area was used for photobleaching
the GFP. Images were collected with 0.14 µm x-y pixel sampling.

Microscopy equipment and settings. Samples were imaged using a 40×
C-Apochromat (numerical aperture 1.2) water immersion lens coupled to a
Zeiss LSM510 META confocal microscope (Carl Zeiss MicroImaging) and the 
multi-time macro (version z) accompanying the Zeiss LSM software. To intro- 
duce DNA DSBs by UV laser micro-irradiation, we used configurations specific 
for exposing living transfected cells to a 364 nm laser (Coherent Enterprise II) at 
intensities equivalent to approximately 0.86 nJ per pixel. As the first block in the 
multi-time macro, we performed one pre-micro-irradiation scan. We then 
micro-irradiated a restricted region of interest, approximately 4 µm² in size,
with the 364 nm laser to cover a single or partial fibrillar center, and finished with a
post-micro-irradiation scan. Within the second block of the multi-time macro, 
fluorescence recovery after photobleaching experiments were performed using 
FRAP specific configurations. The photobleaching was restricted to a consistent 
3.14 µm² area within the UV laser exposed region of interest within the nucleolus.
GFP was excited and photobleached using the 488 nm line of an Argon
multi-line laser (Lasos). Excitation intensity was set at 2% of the laser intensity 
used to ablate GFP fluorescence in the photobleaching step. Five pre-bleach 
images were collected; after bleaching, cells were monitored at 2 s intervals for
250 s. To measure the fluorescence recovery, the mean fluorescence intensities of
both a region within the bleached area and in the total nuclear area were 
recorded. Background fluorescence was measured from control non-transfected 
cells. The relative intensity was normalized by first subtracting mean background 
fluorescence from the total nuclear area fluorescence and the signal in the region 
of interest. Then, the fluorescence in the region of interest was divided by the 
nuclear area fluorescence to obtain a relative fluorescence ratio. These values 
were normalized to one by dividing by the mean of the fluorescence intensity 
ratios in the five pre-bleach scans. All images were acquired using Zeiss LSM 
version 3.2 (at 8 bit depth). The scaling was 0.22 µm × 0.22 µm and the stack size
was 1,024 × 1,024 pixels (230.3 µm × 230.3 µm). The pixel time was 1.60 µs. Unfixed live
cells were maintained at 37 °C in phenol-red-free DMEM during imaging. The
filter used was LP 560 and the objective lens a 40× C-Apochromat (numerical
aperture 1.2). TIFF images were further processed using Canvas software. The
brightness was adjusted to 65 and the contrast to 40 in all files, with the exception 
of images shown in Fig. 4c, for which both the brightness and contrast were 
adjusted to 25. 
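
The FRAP normalization described above (background subtraction, division by the whole-nucleus signal, then scaling to the mean of the five pre-bleach ratios) can be sketched as follows. This is a minimal illustration under the assumption that each trace is a plain list of mean intensities and that a single background value applies to the whole series; the function name and the example numbers are hypothetical.

```python
def normalize_frap(roi, nucleus, background, n_prebleach=5):
    """Return the relative fluorescence of a bleached region over time.

    roi         -- mean intensity in the bleached region, one value per scan
    nucleus     -- mean intensity of the total nuclear area, one value per scan
    background  -- mean intensity measured in non-transfected control cells
    n_prebleach -- number of scans acquired before photobleaching
    """
    if len(roi) != len(nucleus):
        raise ValueError("roi and nucleus traces must have the same length")
    if len(roi) < n_prebleach:
        raise ValueError("trace is shorter than the pre-bleach window")

    # Step 1: subtract background from both signals and take their ratio.
    ratios = [(r - background) / (n - background) for r, n in zip(roi, nucleus)]

    # Step 2: scale so that the mean pre-bleach ratio equals one.
    prebleach_mean = sum(ratios[:n_prebleach]) / n_prebleach
    return [ratio / prebleach_mean for ratio in ratios]


# Hypothetical trace: five pre-bleach scans followed by three recovery scans.
roi_trace = [110, 110, 110, 110, 110, 30, 60, 85]
nuclear_trace = [210] * 8
recovery = normalize_frap(roi_trace, nuclear_trace, background=10)
print([round(value, 2) for value in recovery])
```

Dividing by the nuclear signal compensates for fluorescence lost to acquisition bleaching during the time series, which is why the ratio, rather than the raw region intensity, is normalized.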

Cell culture. CMT3, Ku80−/− MEFs, ATR hypomorphic human cells (Seckel)
and Atm−/− MEFs were grown in DMEM (American Type Culture Collection)
supplemented with 10% FCS (ATCC), 1% penicillin and streptomycin at 37 °C
in 5% CO₂. Cells were transfected by electroporation using an ECM 830 electroporator
with 5 µg RPA194-GFP, PAF53-GFP or RPA43-GFP constructs and
15 µg of sheared salmon sperm DNA. Fluorouridine incorporation was performed
by incubating transfected cells with 2 mM FUrd (Sigma) DMEM media
for 15 min at 37 °C and 5% CO₂. Cells were then fixed with 2% paraformaldehyde
(PFA) for 15 min at room temperature, washed twice with PBS, and stored
in PBS at 4 °C until data acquisition. To introduce DNA DSBs by laser micro-irradiation,
cells were incubated with Hoechst 33342 dye (0.8 µl ml⁻¹; Molecular
Probes) for 20 min, the media was replaced with new media containing 2 mM
FUrd, and cells were incubated for 10 min at 37 °C and 5% CO₂. Cells were then
damaged at specified nucleoli using the 364 nm laser, returned to the incubator
for a further 10 min, and finally fixed in 2% PFA. Cells exposed to 10 Gy of
γ-irradiation were incubated under normal growth conditions for 30 min, media
was then replaced with DMEM containing 2 mM FUrd for 15 min at 37 °C and
5% CO₂, and cells were finally fixed in 2% PFA. FUrd was visualized by incubating cells
with monoclonal anti-BrdU (Sigma) for 1 h, washing twice with PBS, and then
staining for 30 min with goat anti-mouse coupled to Alexa 546 (Molecular
Probes). Damaged cells were visualized using anti-phospho-Histone H2AFX-FITC
(Upstate).

Chromatin immunoprecipitation. 293T or Atm−/− fibroblasts were transfected
with Flag-tagged RPA194. Cells (4 × 10⁷) were exposed to γ-irradiation (5 or
10 Gy), incubated for 15 min at 37 °C, and crosslinked by formaldehyde (0.25%)
for 10 min. Cells were then washed with PBS and resuspended in 1 ml of high
magnesium buffer (10 mM HEPES, pH 7.5, 0.88 M sucrose, 12 mM MgCl₂ and
1 mM dithiothreitol, plus protease inhibitors). Nucleoli were then released by
sonicating on ice (8 bursts of 10 s at 12% amplitude) using a Branson digital
sonifier. The release of nucleoli was monitored microscopically. Nucleoli were
resuspended in 1.0 ml of low magnesium buffer (10 mM HEPES, pH 7.5, 0.88 M
sucrose, 1 mM MgCl₂ and 1 mM dithiothreitol, plus protease inhibitors), resonicated
(10 s at 12% amplitude), and resuspended in 0.2 ml of 20/2TE (20 mM
Tris, pH 8.0, 2 mM EDTA), after which a 1/10 volume of 20% SDS was added.
Samples were incubated at 37 °C for 15 min and sonicated (7 bursts of 10 s each at
12% amplitude) following addition of 0.8 ml of 20/2TE. The resulting sheared
nucleolar chromatin was then diluted tenfold in 0.01% SDS, 1.1% Triton X-100,
1.2 mM EDTA, 16.7 mM Tris-HCl, pH 8.1, 167 mM NaCl, and protease inhibitors.
From this suspension 100 µl were saved as input DNA. To reduce non-specific
background, samples were nutated for 30 min with 400 µl of 50%
slurry of Protein G agarose beads pre-equilibrated with bovine serum albumin
and salmon sperm DNA. Samples were then pelleted and incubated overnight
with 25 µl of monoclonal anti-Flag antibody (F1804, Sigma). A 50% slurry of
Protein G agarose beads (150 µl) was added to the mixture and nutated for 1 h at
4 °C. Immunoprecipitated DNA was recovered following the Upstate protocol. DNA
concentration was measured using Picogreen (Molecular Probes) and FLUOstar
Optima fluorescence software. Samples were resuspended to 0.015 ng ml⁻¹, and
1 µl of DNA was used as a template for quantitative PCR.

Control of DNA methylation and heterochromatic
silencing by histone H2B deubiquitination 


Vaniyambadi V. Sridhar, Avnish Kapoor, Kangling Zhang, Jianjun Zhu, Tao Zhou, Paul M. Hasegawa,
Ray A. Bressan & Jian-Kang Zhu


Epigenetic regulation involves reversible changes in DNA methyla- 
tion and/or histone modification patterns’’. Short interfering 
RNAs (siRNAs) can direct DNA methylation and heterochromatic 
histone modifications, causing sequence-specific transcriptional 
gene silencing’**”. In animals and yeast, histone H2B is known 
to be monoubiquitinated, and this regulates the methylation of 
histone H3 (refs 10, 11). However, the relationship between histone 
ubiquitination and DNA methylation has not been investigated. 
Here we show that mutations in an Arabidopsis deubiquitination 
enzyme, SUP32/UBP26, decrease the dimethylation on lysine 9 of 
H3, suppress siRNA-directed methylation of DNA and release het- 
erochromatic silencing of transgenes as well as transposons. We 
found that Arabidopsis histone H2B is monoubiquitinated at
lysine 143 and that the levels of ubiquitinated H2B and trimethyl
H3 at lysine 4 increase in sup32 mutant plants. SUP32/UBP26 can 
deubiquitinate H2B, and chromatin immunoprecipitation assays 
suggest an association between H2B ubiquitination and release 
of silencing. These data suggest that H2B deubiquitination by 
SUP32/UBP26 is required for heterochromatic histone H3 methy- 
lation and DNA methylation. 

We have previously established a transcriptional gene silencing 
(TGS) system in which a low level of siRNAs does not result in 
TGS in the wild type as a result of the active DNA demethylation 
activity of ROS1 (refs 12, 13). In ros1-1 mutant plants, however, the 
siRNAs cause promoter DNA hypermethylation at the RD29A-LUC 
(firefly luciferase reporter driven by the stress-responsive RD29A

Figure 1 | Effect of sup32 mutations on transcriptional gene silencing, DNA
methylation and histone H3 methylation. a, sup32-1 suppresses RD29A-LUC
transgene silencing in ros1-1. Shown are luciferase images of plants
treated with cold stress. The colour scale at the right shows the luminescence
intensity from dark blue (lowest) to white (highest). WT, wild type. b, sup32-1
suppresses the kanamycin sensitivity of ros1-1. Seeds were planted on MS
medium supplemented with kanamycin (35 mg l⁻¹) and grown for 2 weeks.
c, Northern analysis of transcript levels of the endogenous RD29A gene and
the LUC and NPTII transgenes. COR15A was used as a control for stress
treatment, and TUBULIN was a control for equal loading. ABA, abscisic
acid. d, Cytosine methylation present at endogenous (top) and transgene
(bottom) RD29A promoters as determined by bisulphite sequencing. N
represents A, T or C. Detailed methylation data can be found in
Supplementary Fig. S2a, b. e, ChIP analysis using antibodies against
dimethyl H3K4 and dimethyl H3K9 at RD29A and CaMV 35S promoters in
the wild type (WT), ros1-1, sup32-1ros1-1 and sup32-1. ACTIN was used as a
control. ‘No Ab’ corresponds to chromatin treated without antibody.

promoter) transgene as well as the endogenous RD29A gene. The
35S-NPTII (neomycin phosphotransferase II driven by the CaMV 
35S promoter) transgene is also silenced in ros1-1 because of 
heterochromatic spreading from the adjacent RD29A-LUC locus™. 
We performed a screen and identified sup32-1 as a suppressor of
the ros1-1 mutant. In sup32-1ros1-1 mutant plants, silencing of
the RD29A-LUC and 35S-NPTII transgenes as well as the endogenous
RD29A was partly released (Fig. 1a-c, and Supplementary
Fig. S1a, b).

RNA blot analysis showed that the level of siRNAs generated from
the transgene RD29A promoter was not affected by the sup32-1 mutation
(Supplementary Fig. S1c, d). This result suggests that SUP32 is
not important for siRNA biogenesis but is required for the activity of
siRNAs to trigger TGS. Bisulphite sequencing revealed that the
release of TGS at both the RD29A-LUC and endogenous RD29A loci
was accompanied by a substantial decrease in promoter DNA methylation
in CpNpG and CpNpN sequence contexts (Fig. 1d, and
Supplementary Fig. S2). CpG methylation at the two loci was affected
only slightly by the sup32-1 mutation. An allelic mutant, sup32-2,
obtained from the Salk T-DNA (Agrobacterium-transferred DNA)
collection showed similar phenotypes in releasing TGS in the ros1-1
background (Supplementary Fig. S3a) and decreasing DNA methylation
(Fig. 1d, and Supplementary Fig. S2). In Arabidopsis, dimethylated
H3K4 (lysine 4 of histone H3) and H3K9 are marks of active and
silent chromatin, respectively’. We examined the status of histone
H3 methylation at the RD29A and 35S promoters by chromatin

Figure 2 | Effect of sup32 mutation on the expression, and DNA and
histone methylation patterns, of transposons. a, Expression analysis of the
transposons by RT-PCR. PFK and ACTIN were used as internal controls.
b, c, Bisulphite sequencing results showing percentage methylation.
b, AtLINE1-4; c, AtMULE1 (top), AtGP1 (middle) and AtSN1 (bottom).
Detailed methylation data can be found in Supplementary Fig. S4a-d.
d, ChIP analysis using antibodies against dimethyl H3K4 and dimethyl
H3K9 at transposon loci. ACTIN was used as a control. ‘No Ab’ corresponds
to chromatin treated without antibody.

immunoprecipitation (ChIP) assays. As shown in Fig. 1e, these promoters
in the wild type were associated with abundant H3K4
dimethylation, whereas in ros1-1 they were preferentially associated 
with H3K9 dimethylation. In contrast, dimethylated H3K4 was 
enriched in sup32-1ros1-1 compared with dimethylated H3K9 at both 
the RD29A and 35S promoters. In the sup32-1 single mutant, the 
promoters were also preferentially associated with dimethylated 
H3K4. These results show that the SUP32 gene is required for 
H3K9 dimethylation, CpNpG and CpNpN methylation and TGS. 
The lack of difference in histone and DNA methylation at the trans- 
genes between the wild type and the sup32-1 single mutant suggests 
that SUP32 is important for silent but not already active loci. 

Transposons are prominent endogenous targets of TGS and are 
associated with high levels of DNA methylation and H3K9 dimethylation’*.
The expression of transposon AtMULE1 and retrotransposons
AtLINE1-4, AtGP1 and AtSN1 was enhanced by the sup32
mutation in the ros1-1 or wild-type background (Fig. 2a, and Supplementary
Fig. S3b). Interestingly, AtGP1 expression was higher in
sup32-1 than in sup32-1ros1-1 (Fig. 2a), supporting the notion that
ROS1 has a negative role in the TGS of this transposon”. For all of the
transposons, CpG methylation levels were comparable in the wild
type and in ros1-1, sup32-1ros1-1 and sup32-1, but there were substantial
decreases in the levels of CpNpG and CpNpN methylation in
sup32-1ros1-1 and sup32-1 (Fig. 2b, c, and Supplementary Fig. S4a-d).
As shown in Fig. 2d, in both the wild type and ros1-1, dimethylated
H3K9 was enriched in these loci compared with dimethylated
H3K4. In contrast, dimethylated H3K4 was enriched relative to 
dimethylated H3K9 in sup32-1ros1-1 and sup32-1, which is consist- 
ent with the increased expression of these loci in these mutant plants. 
Taken together, the results suggest that the sup32-1 mutation releases 
TGS of the transposons and this involves decreases in non-CpG DNA 
methylation and H3K9 dimethylation.

Figure 3 | SUP32 encodes a nuclear ubiquitin protease. a, UBP activity of
SUP32/UBP26 in vivo in E. coli. The substrate UB-CEP 52 was co-expressed
with XPress tag (−UBP26), XPress-tagged UBP26 (+UBP26) or XPress-tagged
UBP26C115S (+mUBP26). Upper panel, immunoblot probed with
anti-ubiquitin antibody; lower panel, E. coli extracts probed with XPress
antibody showing expression of recombinant UBP26 and UBP26C115S. The
antibody also cross-reacted with a smaller non-specific band (marked by an
asterisk), which served as a loading control. b, UBP26-YFP fusion protein is
localized in the nucleus. The picture shows YFP signal in the nuclei of root
cells of Arabidopsis plants transformed with CaMV 35S promoter-driven
UBP26-YFP. The inset is a close-up of one of the nuclei. c, Rescue of ubp26-1ros1-1
mutant phenotype by expressing CaMV 35S promoter-driven UBP26
(+UBP26), but not by expressing UBP26C115S mutant (+mUBP26)
cDNAs. Luciferase imaging was performed after 48 h of cold treatment.

Thermal asymmetric interlaced polymerase chain reaction (TAIL-PCR)
was performed to clone the SUP32 gene. A single insertion was
found in the third intron of AT3G49600/UBP26, which encodes 
a protein of 1,067 amino acids and shows similarity to ubiquitin- 
specific proteases (UBPs)’’. The sup32-2 allele has an insertion in the 
seventh exon (Supplementary Fig. 5a). UBPs are thiol proteases that 
specifically cleave ubiquitin carboxy-terminal glycine from covalently 
attached proteins'’. To test the UBP activity of SUP32/UBP26, we co- 
expressed XPress-tagged UBP26 together with the substrate UB-Cep52 
in Escherichia coli'*. As shown in Fig. 3a, ubiquitin monomer was 
detected in extracts from bacterial cells expressing UBP26 but not in 
those expressing XPress tag alone. A mutant version of UBP26 in which 
an active-site Cys 115 was changed to serine did not show the UBP 
activity, although the mutant protein was expressed at a similar level to 
the wild-type protein (Fig. 3a). A fusion protein of UBP26 and yellow 
fluorescent protein (YFP) was found to be localized in the nucleus 
in transgenic plants (Fig. 3b). Promoter-GUS (β-D-glucuronidase)
experiments together with reverse transcriptase (RT)-PCR analysis
indicated a ubiquitous expression pattern of UBP26 in plants (Supplementary
Fig. S5b, c). To confirm that AT3G49600 is the SUP32/UBP26
gene, we expressed AT3G49600 complementary DNA under
the CaMV 35S promoter in ubp26-1ros1-1. Luciferase imaging of the
transformants indicated that the mutant was complemented (Fig. 3c).
When the UBP26C115S mutant cDNA was expressed, luciferase
imaging indicated that it failed to complement the ubp26-1ros1-1
mutant (Fig. 3c). These results show that UBP26 is a nuclear UBP
and its deubiquitination activity is required for TGS. 

Recently, studies in animals and yeasts have found that ubiqui- 
tination of histone H2B and H2A is involved in transcriptional 
regulation”®. It is not known whether any Arabidopsis histone is ubiqui- 
tinated. We isolated total histones from Arabidopsis seedlings and used 
mass spectrometry to examine modifications in H2A and H2B. We did 
not detect any ubiquitinated H2A but found that H2B is monoubiqui- 
tinated at Lys 143 in the C terminus (AVTKFTSS) (Fig. 4a). The 
amino-acid sequences surrounding the H2B ubiquitination sites are 
highly conserved between yeast, human and Arabidopsis (Supplemen- 
tary Fig. S6). 

We examined the relative levels of ubiquitinated H2B (ubH2B) in 
the wild type, ros1-1, sup32-1ros1-1 and sup32-1 mutant plants. Anti- 
H2B antibodies detected not only an approximately 19-kDa protein 
corresponding to unubiquitinated H2B, but also an approximately 
27-kDa band corresponding to monoubiquitinated H2B (ubH2B) 
in Arabidopsis total histone preparations (Fig. 4b, top panel). 
Indeed, this approximately 27-kDa band was also recognized by 
anti-ubiquitin antibodies (Fig. 4b, middle panel). The levels of 
ubH2B were higher in sup32-1ros1-1 and sup32-1 than in ros1-1 or
the wild type (Fig. 4b). Consistent with the western blot results, mass
spectrometry analysis also indicated higher levels of ubH2B in sup32-1
and sup32-1ros1-1 mutant plants (data not shown). To determine
whether UBP26 can deubiquitinate ubH2B, we expressed UBP26 in a
yeast strain that expresses Flag-tagged H2B and carries deletions in 
the yeast UBP8 and UBP10 (ref. 19). As shown in Fig. 4c, the level of 
yeast ubH2B is decreased in the presence of UBP26 but not mutant 
UBP26 (in which the active-site Cys115 was changed to serine) 
compared with that in the vector control. We also investigated 
whether UBP26 can deubiquitinate Arabidopsis ubH2B in vitro. As 
shown in Fig. 4d (top and middle panels), the level of ubH2B in a 
sample of purified Arabidopsis histones is decreased by incubation 
with UBP26 but not with the buffer control or the inactive UBP26 
mutant, as revealed by probing with an anti-H2B antibody. Probing 
of these same samples with an anti-ubiquitin antibody showed the 
appearance of free ubiquitin accompanying the disappearance of 
ubH2B, in the sample treated with UBP26 but not with the buffer 
control or mutant UBP26 (Fig. 4d, bottom panel). Together, these 
data show that UBP26 can deubiquitinate ubH2B, and a loss-of- 
function mutation in UBP26 causes an elevated level of ubH2B in 
plants. 



In a further examination of the consequence of increased mono- 
ubiquitination of H2B in ubp26-1 (that is, sup32-1), steady-state 
levels of histone H3 methylation were determined by using specific 
antibodies. As shown in Fig. 4e, ubp26-1 had higher levels of tri- 
methylated H3K4, and also a slight increase in dimethylated H3K4, 
compared with wild-type plants. In contrast, no increase was found 
for dimethylated H3K9 in the mutant. 

In the absence of antibodies specific for ubiquitinated H2B, we 
performed chromatin double immunoprecipitation (ChDIP) assays 
to test for an association between H2B ubiquitination and release of 
TGS. Chromatin was first immunoprecipitated with an anti-ubiquitin

Figure 4 | Analysis of H2B ubiquitination and its association with gene
activation. a, Identification of H2B ubiquitination site by mass
spectrometry. Detailed information is included in Methods. b, Western
analysis of purified Arabidopsis histones with anti-H2B antibody (top
panel), anti-ubiquitin antibody (middle panel) and anti-H3 antibody
(bottom panel). c, Effect of UBP26 expression on histone H2B
ubiquitination in vivo in yeast. Ubiquitination levels of Flag-H2B were
assayed in yeast whole-cell lysates by using anti-Flag antibody (15%
SDS-PAGE). The expression level of Flag-UBP26 was also determined in the
lysates with the use of anti-Flag antibody (8% SDS-PAGE).
d, Deubiquitination activity of UBP26 on Arabidopsis ubiquitinated H2B in
vitro. Purified histones were incubated with buffer control (mock), UBP26
(+UBP26) or UBP26C115S mutant protein (+mUBP26). Deubiquitination
was assessed by immunoblotting with anti-H2B antibody (top panel) or
anti-ubiquitin antibody (middle and bottom panels). e, Western blot
analysis of histone H3 methylation levels in Arabidopsis. f, ChDIP analysis
by immunoprecipitating chromatin first with anti-ubiquitin antibodies and
then with anti-H2B antibodies. The eluted DNA was subjected to PCR for
AtGP1, AtLINE1-4 and AtMULE1. Tubulin was used as an internal control.
‘No Ab’ and Ub correspond to chromatin first treated without and with
anti-ubiquitin antibodies, respectively.

antibody and eluted from the precipitate with ubiquitin. The supernatant
was immunoprecipitated with an anti-H2B antibody, and
the resulting DNA was eluted and subjected to PCR for transposon 
loci affected by the ubp26 mutation. As shown in Fig. 4f, we found 
an enrichment of AtGP1, AtMULE1 and AtLINE1-4 in the double-immunoprecipitated
chromatin from ubp26-1 plants, in which these
transposons are transcriptionally active. In contrast, there was no 
enrichment in wild-type plants, in which the same transposons are 
inactive (Fig. 4f). These data show an association between H2B ubi- 
quitination and release of TGS. 

Our data show that histone H2B is monoubiquitinated in plants, 
and that the nuclear ubiquitin protease, UBP26, has a critical func- 
tion in non-CpG DNA methylation and TGS. The results implicate 
UBP26 in the deubiquitination of H2B in Arabidopsis and suggest 
that this deubiquitination is required for H3K9 dimethylation, which 
has been shown to direct CpNpG and CpNpN methylation”. In 
yeast and animals, H2B ubiquitination is known to be required for 
methylation of H3K4 (ref. 11) and H3K79 (ref. 22), leading to active 
transcription. Given the bulky size of ubiquitin (relative to acetyl, 
methyl or phosphoryl groups) and the role of ubiquitinated H2B in 
keeping chromatin in an open state'®, H2B deubiquitination is likely 
to be an early and crucial event in heterochromatin formation. 
UBP26-mediated deubiquitination may be an upstream event in 
the siRNA-directed TGS pathway. Alternatively, H2B deubiquitina- 
tion might not depend on siRNAs but the deubiquitination might be 
required to make the chromatin competent for siRNA-directed DNA 
methylation and TGS. In any event, the accumulation of active chro- 
matin marks such as ubH2B and trimethylated H3K4 as well as the 
release of TGS in the ubp26 mutant suggest that UBP26 has an 
important function in controlling DNA methylation and heterochro- 
matin formation by initiating the removal of active histone modi- 
fication marks. 


METHODS SUMMARY 


The sup32-1 mutant was isolated by screening a T-DNA mutagenized ros1-1 
population. The suppressor mutant was characterized by analysing the express- 
ion level using northern blots and RT-PCR, DNA methylation using bisulphite 
sequencing and histone modification patterns by using ChIP, at various trans- 
genic and endogenous loci. Histone 2B ubiquitination was determined by mass 
spectrometry and western blot analysis. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS

Plant growth, mutant screening, and cloning. The wild-type plants are in the 
C24 background and carry the homozygous RD29A-LUC transgene”*. A T-DNA 
population in the Arabidopsis thaliana mutant ros1-1 background was obtained 
after floral transformation of inflorescences with Agrobacterium GV3101 car- 
rying the binary vector ppUPERTAG™. Plants were grown in a controlled room 
at 22 °C with 16 h of light and 8 h of darkness. Seedlings to be used for luciferase
(LUC) imaging were planted on Murashige and Skoog (MS) medium and were
stratified for 2-4 days at 4 °C. About 200 T2 seedlings from each 50-line pool
were screened after cold treatment (0 °C, 2 days) or treatment with 100 µM abscisic acid
(for 3 h), and RD29A-LUC expression was analysed as described”. Image acquisition
(5 min) with use of a charge-coupled device system and processing were per- 
formed as described****. Mutants were also evaluated after being sown on MS 
medium supplemented with kanamycin. 

The genomic sequence flanking the T-DNA insertion was determined by using 
the thermal asymmetric interlaced PCR procedure with primers corresponding 
to nested regions internal to the left border and degenerate primers. 

The sup32-1 mutant was backcrossed to ros1-1 to eliminate extraneous 
mutations and to confirm that the mutation was recessive. An additional allele 
(sup32-2) was also obtained from the ABRC Arabidopsis stock centre 
(SALK_024392.40.25.X). 

DNA and RNA analysis. DNA methylation assays and RNA blot analysis were as 
described; 15-20 clones were analysed in the bisulphite sequencing 
experiments to determine the methylation status of a locus in each genotype. For 
RT-PCR analysis of transposons, total RNA was extracted from 15-day-old 
seedlings with Trizol (Invitrogen), treated with DNase I and further purified 
with RNeasy columns (Qiagen). For AtSN1 expression analysis, total RNA was 
extracted from inflorescences; 100 ng of RNA was used as input in RT-PCR 
reactions with a one-step RT-PCR kit (Qiagen). PCR conditions and primer 
sequences were as described previously. 

Chromatin immunoprecipitation. ChIP assays were performed on 20-day-old 
seedlings as described previously. Chromatin samples were 
immunoprecipitated with antibodies against H3 dimethyl Lys 4 or ubiquitin 
(Upstate), or with antibody against H3 dimethyl Lys 9 (gift from T. Jenuwein). 
The PCR conditions and the primer sequences were as described previously. 
For ChDIP, the chromatin samples were immunoprecipitated with 
anti-ubiquitin antibody (Upstate) and then eluted with ubiquitin (1 µg µl⁻¹); 
90% of the eluted sample was immunoprecipitated with anti-H2B antibody 
(Upstate), and DNA was eluted and detected as in standard ChIP assays. 

UBP activity assay. Ub-CEP 52 (p8185) was co-expressed with UBP26 and 
mutant UBP26 in E. coli. Exponential-phase cultures were induced for 15 min 
with 0.3 mM isopropyl β-D-thiogalactoside, and equal numbers of cells, 
determined by A₆₀₀, were suspended in Laemmli sample buffer and run on a 15% 
SDS-PAGE gel. Substrate and ubiquitin were detected with an anti-ubiquitin 
immunoblot. For UBP activity tests in yeast, Flag-UBP26 and Flag-UBP26 
mutant were separately cloned into the vector pYES3 (Invitrogen), transformed 
into yeast strain UCC6393 and detected with anti-Flag antibody as described 
previously. The in vitro deubiquitination assay was performed with purified 
histones from Arabidopsis as described. 

Arabidopsis histone isolation and analysis. Histone preparation from 
Arabidopsis was performed as described previously; 10 µg of purified histone 
was separated on 15% acrylamide gel and blotted to Hybond enhanced 
chemiluminescence (ECL) membrane (Amersham). The membrane was probed with 
antibodies against H2B, ubiquitin, trimethyl H3K4, dimethyl H3K4 or dimethyl 
H3K9 (Upstate) and detected with the enhanced ECL system (Amersham). The 
membrane was stripped in accordance with the manufacturer’s protocol before 
being reprobed with other antibodies (Upstate). For in vitro deubiquitination 
assays, Arabidopsis nuclei were extracted in an isolation buffer (0.25 M sucrose, 
25 mM HEPES pH 7.5, 3 mM CaCl₂, 10 mM NaCl, 1 mM 
phenylmethylsulphonyl fluoride (PMSF), 1 mM dithiothreitol (DTT), 0.25% 
Nonidet P40, leupeptin and pepstatin A). The nuclei were pelleted by 
centrifugation at 1,300g for 10 min and washed twice with the isolation buffer. 
Histones were extracted from the nuclei with 0.2 M HCl and the supernatant 
was precipitated with acetone. The isolated histones were renatured by dialysis 
against multiple changes of dialysis buffer (25 mM HEPES pH 7.5, 0.1 mM EDTA, 
12.5 mM MgCl₂, 50 mM KCl, 1 mM PMSF, leupeptin, 1 mM DTT and 10% glycerol). 

Transgenic plant analysis. A cDNA clone containing the full-length UBP26 
open reading frame was amplified by RT-PCR from Arabidopsis inflorescence 
and cloned in a Gateway recombination vector in pDONR207 (Invitrogen) after 
PCR with the primers pDON-UBPFP and pDON-UBPRP1 (to make a clone 
with a stop codon) and pDON-UBPFP and pDON-UBPRP2 (to make a clone 
without a stop codon). We refer to these clones as pDONRUBP1 and 
pDONRUBP2, respectively, and they were used for cloning UBP26 in 
destination vectors after recombination. Primer information is available from the 
authors on request. For complementation analysis, we also made an active site 
mutant, carrying a single amino-acid substitution at Cys 115, by using primers 
5’-GTTGGCATAAGAAGTAGCACCCAAATTAGTCAG-3’ and 
5’-GGTGCTACTTCTTATGCCAACAGTATACTTCAG-3’. These clones were then introduced 
into the binary vector pMDC32 (obtained from M. D. Curtis), which has a dual 
35S promoter for constitutive expression of the gene after recombination. 

For protein localization, we made a UBP26-YFP translational fusion with the 
use of a pEG101 binary vector obtained from C. Pikaard and performed a 
recombination reaction to make pEG101UBP26. For construction of UBP26 
promoter-GUS transcriptional fusion, a 2-kilobase region upstream of the start 
codon was amplified with primers pDON-UBPProF and pDON-UBPProR from 
the BAC clone T9CS5. This PCR product was cloned in pDONR207 and 
subsequently in the binary vector pMDC164 (obtained from M. D. Curtis) after 
recombination, to make pMDC164UBPPro. 

All binary vectors for plant transformation were transferred to Agrobacterium 
tumefaciens GV3101 (pMP90) by electroporation. After selection with 
appropriate antibiotics, Agrobacterium was grown overnight at 28 °C in 
Luria-Bertani medium and then used for floral dip transformation. The wild-type 
cDNA overexpression and the mutant cDNA overexpression constructs were 
transformed in the sup32-1 ros1-1 background. The YFP fusion construct was 
transformed in the wild-type (C24 containing RD29A-LUC transgene) background, 
and the UBP26 promoter-GUS fusion was transformed in the Columbia wild type. 

Transgenic lines were selected for the transgene by sowing seeds on MS 
medium supplemented with hygromycin at a concentration of 30 mg l⁻¹ (for 
pMDC32UBP and pMDC164UBPPro) or with glufosinate ammonium (for 
pEG101UBP). 

Mass spectrometry. Liquid chromatography-mass spectrometry (LC-MS) and 
liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiments 
were performed on a hybrid quadrupole-time of flight (Q-TOF) mass 
spectrometer (Waters), which was coupled with an Agilent HP1100 capillary HPLC 
running at a flow rate of 5 µl min⁻¹ on a Zorbax SB-C18 (150 mm × 0.5 mm, 
5 µm bore) column with a gradient from 2% mobile phase B (mobile phase A was 
0.1% formic acid in water and mobile phase B was 0.1% formic acid in 
acetonitrile) to 65% mobile phase B in 60 min. For LC-MS/MS experiments, the 
Q-TOF mass spectrometer was run in a survey mode as described previously. The 
raw data were converted into peak-list (PKL) files that were processed by MASCOT 
(http://www.matrixscience.com) for searching proteins and protein 
post-translational modification sites. De novo sequencing, with the aid of 
Prospector-Product software (http://prospector.ucsf.edu), was also performed for 
the confirmation of modification sites reported by MASCOT and for finding 
modification sites that MASCOT might fail to determine. For LC-MS experiments, 
the Q-TOF mass spectrometer was run in a single-positive electrospray ionization 
mode. 

Identification of H2B ubiquitination site. On the basis of our knowledge that 
ubiquitination occurs at Lys 123 of human H2B on the C-terminal peptide 
AVT¹²³KYTSS or at Lys 120 of yeast on the C-terminal peptide AVT¹²⁰KYSSS, 
we reasoned that Lys 143 of plant H2B on the C-terminal peptide AVT¹⁴³KFTSS 
might be ubiquitinated. Trypsin cuts after the C-terminal arginine and thus 
removes all except two glycine residues attached to the previously ubiquitinated 
lysine residue, resulting in an increment of 114 Da to the peptide mass, which can 
be used by mass spectrometry for diagnosis of the ubiquitination site. We there- 
fore analysed the MS/MS spectrum of the precursor ion manually at m/z 477.7, 
which was doubly charged and was 57 mass units heavier than the H2B 
C-terminal peptide AVTKFTSS. Observed amino-terminal fragmentation ions, 
namely ‘a’ and ‘b’ ions (a3, bz, b3, bs and by or their water-loss ions), and the 
C-terminal ions, namely ‘y’ ions (y2, y4, ye and y7 and their water-loss ions), 
matched well the peptide sequence AVTK(GG)FTSS with a GG chain attached to 
the lysine amino acid. A GG chain was observed linking with the lysine amino 
acid by the fact that a 114-Da mass increase was added to bg, b7, ye and yz ions, 
which cover the fragments including the lysine amino acid and a series of internal 
fragmentation ions and their water-loss ions (cleavage from both N-terminal 
and C-terminal directions) corresponding to TK(GG), K(GG)F, TK(GG)F and TK(GG)FT. 
All of these had a 114-Da mass increase from their counterparts in which the 
lysine amino acid was not modified. H2B ubiquitination at Lys 143 was therefore 
unambiguously determined by MS. 
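
The mass arithmetic behind this assignment can be checked with a short calculation. The sketch below is illustrative only and is not the authors' software: it assumes standard monoisotopic residue masses, and the function name peptide_mz is hypothetical. It computes the doubly charged m/z of AVTKFTSS with and without one Gly-Gly remnant on the lysine.

```python
# Monoisotopic residue masses (Da) for the residues found in the
# H2B C-terminal peptides discussed in the text (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "T": 101.04768, "K": 128.09496, "F": 147.06841,
}
WATER = 18.01056    # added once per peptide (free N and C termini)
PROTON = 1.00728    # mass of each charge-carrying proton
DIGLY = 2 * RESIDUE_MASS["G"]  # Gly-Gly remnant left on lysine, about 114 Da


def peptide_mz(sequence, charge, n_digly=0):
    """Return the m/z of a peptide carrying n_digly Gly-Gly remnants.

    Hypothetical helper for illustration; it handles only the residues
    listed in RESIDUE_MASS.
    """
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + n_digly * DIGLY
    return (neutral + charge * PROTON) / charge


unmodified = peptide_mz("AVTKFTSS", charge=2)
modified = peptide_mz("AVTKFTSS", charge=2, n_digly=1)
print(f"unmodified  [M+2H]2+: {unmodified:.2f}")
print(f"GG-modified [M+2H]2+: {modified:.2f}")
print(f"m/z shift at charge 2: {modified - unmodified:.2f}")
```

Under these assumptions the modified peptide comes out close to the m/z 477.7 precursor described above, and the 114-Da remnant appears as a shift of about 57 m/z units because the ion is doubly charged.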

Relative quantification of H2B ubiquitination was performed by LC-MS 
analysis of the Arg-C and V8 dual-enzyme digestion of H2B protein. The percentage 
ubiquitination was calculated as the relative peak intensity between the ion at 
m/z 620.8, corresponding to the dual-enzyme-cleaved ubiquitinated peptide 
GTKAVTK(GG)FTSS, and the ion at m/z 563.8, corresponding to the unmodified 
peptide GTKAVTKFTSS. 
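
The text does not give an explicit formula for the percentage. One plausible reading, sketched below purely as an assumption, is the intensity of the ubiquitinated ion divided by the summed intensity of both ions, which also assumes that the two peptides ionize equally well. The function name and the example intensities are hypothetical.

```python
def percent_ubiquitinated(intensity_modified, intensity_unmodified):
    """Estimate percentage ubiquitination from two peak intensities.

    Assumed formula (not stated in the source): modified / (modified + unmodified).
    intensity_modified   -- peak intensity of the m/z 620.8 ion (GG-modified peptide)
    intensity_unmodified -- peak intensity of the m/z 563.8 ion (unmodified peptide)
    """
    total = intensity_modified + intensity_unmodified
    if total <= 0:
        raise ValueError("summed peak intensity must be positive")
    return 100.0 * intensity_modified / total


# Hypothetical intensities in arbitrary units, for illustration only.
print(percent_ubiquitinated(150.0, 850.0))
```

Because the result is a ratio of two peaks from the same run, it is useful for comparing genotypes with each other rather than as an absolute measure of occupancy.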
The generation game 


Proteomics is hungry for well-validated antibodies. Nathan Blow looks at the options and sees how 
researchers are redefining the way to generate an antibody. 


The products of more than 22,700 
genes make up the human pro- 
teome. But researchers hoping 
to unpick the mysteries of the 
proteome are restricted by the 
fact that antibodies against only a 
small percentage of these proteins 
are available. Although commer- 
cial production of antibodies is 
well established, the market has 
so far been driven by the popu- 
larity of particular antibody tar- 
gets. More than 1,000 antibodies 
against the tumour suppressor p53 
are available, for example, but the 
less sought-after targets frequently 
have none. 

With the rise of proteomics, the need for antibodies has become 
global and is no longer limited to a small number of targets central to 
most hypothesis-driven research projects. “One of the hurdles in 
proteomics is a lack of high-quality, well characterized affinity 
reagents,” says Henry Rodriguez, director of the clinical proteomic 
technologies initiative for cancer at the US National Cancer Institute 
(NCI) in Bethesda, Maryland. 

Image of ovarian tissue stained using a commercial antibody. 

Not only do researchers need access to more antibodies, they also 
need to know how these antibodies have been characterized to 
determine whether they will work in the assay they are using. “Large 
numbers of antibodies are already available, but an investigator has to 
navigate through a complex system to find out which target antibodies 
are going to be appropriate for his or her particular assays,” says 
Adam Clark, who works on the NCI’s proteomic technologies initiative. 


Standard issue 

At present, there is no universal validation: an antibody that works 
wonders for a Western blot may perform poorly in 
immunohistochemistry. This growing need for faster antibody production 
and stronger validation data is leading many groups to explore 
high-throughput methodologies for creating and validating affinity 
reagents. 

A group led by Mathias Uhlén at the Royal Institute of Technology in 
Stockholm, Sweden, is spearheading an initiative to assemble the 
Human Protein Atlas (www.proteinatlas.org). The project aims to 
explore the entire human proteome using antibodies. Uhlén’s group is 
producing polyclonal antibodies directed against each human protein, 
and then characterizing the antibodies using Western blots, protein 
microarrays and immunohistochemistry. 

The project generates around ten new polyclonal antibodies and more 
than 10,000 immunohistochemistry images every day, an achievement 
that relies heavily on high-throughput methodology. Although most 
steps in the process are amenable to automation, some, such as 
annotating tissue immunohistochemistry images, are proving to be a 
significant challenge. Indeed, analysing these images still involves ten 
pathologists who have to annotate them manually. “All the annotation 
is Internet based. The pathologists view and evaluate 600 tissue images 
per antibody via a web-based tool on their personal computers, and the 
results are stored in our database,” says Uhlén. 

Mathias Uhlén is using antibodies to unravel the human proteome. 

In addition to producing and testing their own antibodies, Uhlén and 
his team will put antibodies from commercial sources through their 
standardized quality-control pipeline. Uhlén says he was surprised that 
only about 35% of commercial antibodies seemed to work, although he 
notes that this could be a result of the way his group analyses them. 
“We decided to use a very standardized way of validating antibodies: 
if they don’t work, we don’t try other ways of doing it,” he says. The 
success rate may be low, but such a standardized quality-control process 
offers researchers a rare level of confidence in the antibodies that 
pass the test. 

Currently, the atlas consists of more than 1,500 antibodies, half of 
which have been provided by commercial sources. In October 2007, a 
new release will be made available bringing the total number of 
antibodies to slightly less than 3,000. Confocal microscopy images of 
tissues stained with the antibodies will also be released in October to 
complement immunohistochemistry images. “We are committed to 
showing the public all the primary validation of the antibodies,” says 
Uhlén. Although polyclonal antibodies are not renewable, a limited 
number of aliquots of each antibody generated by the atlas will be 
made available through Atlas Antibodies of Stockholm. 

Other initiatives are focusing on monoclonal, rather than polyclonal, 
antibodies against proteins on a large scale. In 2002, for example, the 
Wellcome Trust Sanger Institute near Cambridge, UK, launched the 
Atlas of Gene Expression. Although a change in focus at the institute 
means that this is closing down, the atlas was set up to generate 
high-quality monoclonal antibodies that were well characterized for a 
variety of assays. Much as for the Human Protein Atlas, a 
well-standardized characterization was key. “Historically, the 
validation of antibodies has been ad hoc with different people 
generating antibodies that have been assessed in different ways,” says 
project leader John McCafferty. 

The project involved four key groups. One 
generated the proteins of interest and did qual- 
ity control. This team used the Gateway clon- 
ing system made by Invitrogen of Carlsbad, 
California, to move a variety of open reading 
frames between different expression vectors 
and so improve yields for troublesome pro- 
teins. A second group used phage display for 
high-throughput screening of a single-chain 
antibody library generated at the Sanger Institute 
and containing more than 10¹⁰ phage clones. 
The other two groups were dedicated 
to immunohistochemistry and the informatics 
infrastructure necessary to deal with the large 
volume of data. 

To deal with image acquisition issues, the 
institute collaborated with Applied Imaging of 
San Jose, California, to develop an automated 
high-throughput image-analysis system suit- 
able for tissue microarray applications. So far, 
the Sanger project has generated more than 
4,000 monoclonal antibodies to 290 antigens, 
which are available to buy from Geneservice in 
Cambridge, UK. 

Although the project is being discontinued, 
McCafferty says that much has been learned 
about the bottlenecks of high-throughput 
generation of antibodies and how these can be 
overcome. “Surprisingly, the generation of the 
antibodies was not the major issue,” he says. 
“The bottlenecks were generating good qual- 
ity protein product to do selection, and how 
to deal with the large amounts of image data a 
project such as this produces.” 


Finding affinity 

Even as the Sanger project comes to a close, other initiatives are 
beginning to gather steam, although these have been hampered 
somewhat by a lack of funding. “There seems to be a reluctance from 
the funding agencies to put money into large-scale antibody 
initiatives,” says Andrew Bradbury of the biosciences division at Los 
Alamos National Laboratory in New Mexico. The NCI’s five-year, 
$104-million clinical proteomic technologies initiative that is now 
getting off the ground may be the start of a change. 

Cells stained with the 4G10 anti-phosphotyrosine antibody from 
Millipore. 

In 2005, the NCI held a workshop to dis- 
cuss affinity capture. It found that the scien- 
tific community wanted renewable resources 
that were well characterized for performance 
data, says Clark. The meeting also revealed 
that the community was concerned by the 
lack of characterization data for most avail- 
able antibodies. 

Following this lead, the NCI proteomics rea- 
gent core, one of the centres in the clinical pro- 
teomic technology initiative, is embarking on 
the production of affinity reagents. To focus its 
efforts, it has identified a list of protein targets: 
all cancer-related proteins for which no commercial antibodies are yet 
available. The core will develop monoclonal antibodies that will be 
characterized by Western blots, enzyme-linked immunosorbent assay 
(ELISA), immunohistochemistry and immunoprecipitation followed 
by mass spectrometry. “All the raw data on how the antibodies perform 
on a variety of assays will be provided and an investigator will be able 
to acquire these antibodies through a website organized by the NCI,” 
says Clark. 

In Europe, another group of investigators 
plans to generate affinity reagents against 
the human proteome. The group, called ProteomeBinders, 
consists of 26 European Union and two US institutional partners. 
“The goal or the hope is to get funding from the European Union 
to put a project together next year or the year after,” says Bradbury, 
one of the US participants. 

Although the antibody remains the affinity 
reagent of choice, the exploration of alterna- 
tive binders by large groups, such as Proteome- 
Binders, shows how far these non-traditional 
reagents have come in a relatively short time. 




Gold standard 
A quick glance through the catalogues from 
commercial vendors and researchers reveals 
thousands of antibodies not only to proteins, 
but also to specific protein changes such as 
post-translational modifications. Still other 
companies offer to produce antibodies to an 
investigator's antigen of interest. 

Monoclonal antibodies produced by animal immunization remain the 
‘gold standard’ of affinity reagents. They are relatively renewable, can 
usually be made with high specificity and affinity for their target and 
can be used in common biochemical assays such as Western blotting, 
ELISA and immunochemistry. But the traditional monoclonal antibody 
has its drawbacks. Its production can be challenging, time-consuming 
and costly. So there is a lot of interest in identifying novel affinity 
reagents that would be less expensive and quicker to produce. “The 
future is with alternative binders,” says Bradbury, who works with 
single-chain antibodies (see ‘Antibodies in the fast lane’, page 743). 


ANTIBODIES IN THE FAST LANE 

In making recombinant antibodies, the resulting antibody is only as 
good as the combinatorial library and the screening assay. The trick is 
to find the molecule of highest affinity and specificity for the target 
among a library of millions of clones. Traditionally, recombinant 
antibody libraries have been phage-based and the screening relied on 
enzyme-linked immunosorbent assay (ELISA), not a high-throughput 
method. Changes in both phage display and screening methods are now 
moving recombinant antibody production into a high-throughput world. 

Flow cytometry has become the assay of choice for rapid screening of 
clones from recombinant antibody libraries. “We looked at different 
ways of screening. Flow cytometry was the only one that seemed to 
meet our throughput requirements,” says Andrew Bradbury of Los 
Alamos National Laboratory in New Mexico. Bradbury's group has 
developed a flow-cytometry assay to screen its single-chain 
antibody-fragment phage libraries using a mixture of beads coated with 
specific and non-specific antigens. The method rapidly identifies 
antibodies that have good affinity for the protein of interest while 
discarding those that show low specificity. 

Phage-display screening methods are also used to identify antibodies 
that target post-translational modifications (PTMs) such as 
phosphorylation or acetylation. As PTMs have a role in many processes, 
from gene regulation to apoptosis, they are of growing interest for the 
biological community. Companies have responded by developing 
antibodies targeting proteins in a specific state of modification. 
“Antibodies to PTMs are gaining in importance with customers,” says 
Kumar Bala, director of antibody technologies for Millipore in the 
company's lab in Temecula, California. 

But obtaining antibodies directed against PTMs is not a trivial task. 
Rockland Immunochemicals of Gilbertsville, Pennsylvania, has put in a 
lot of effort to develop antibodies for looking at phosphorylated and 
non-phosphorylated forms of various proteins in a sequence-independent 
context, says Daniel O'Shannessy, the company's vice-president of 
corporate development. 

Antibodies that recognize PTMs independently of the protein site on 
which the modification occurs are useful, particularly for enriching, 
for example, all phosphorylated proteins from a cell. But such 
antibodies are hard to make by animal immunization as the PTM itself 
is not immunogenic. So scientists are turning to recombinant molecules 
and in vitro screening such as phage display to isolate ‘pan-PTM’ 
affinity reagents. 

Although these are steps in the right direction, many more antibodies 
and further improvements in affinity reagent technology will be needed 
to understand and characterize the full range of PTMs found in nature. 
Still, Bala argues that the “best tools for purifying, identifying, 
differentiating and characterizing PTMs are antibodies”. N.B. 

Andrew Bradbury develops recombinant single-chain antibodies. 


Optimistic expression 
The structural characteristics of antibodies 
make it difficult to produce recombinant 
versions in bacteria and restrict their use in 
some high-throughput screening methods. 
But George Georgiou and his colleagues at 
the University of Texas at Austin have come 
up with a method to produce and screen full- 
length immunoglobulin G (IgG) antibodies 
expressed in Escherichia coli (Y. Mazor et al. 
Nature Biotechnol. 25, 563-565; 2007). 

The technology produces full-length anti- 
bodies that are initially tethered to the inner 
membrane of the bacterium. When the bac- 
teria are treated with EDTA and lysozyme, the 
resulting spheroplasts with exposed antibodies 
can be selected in a high-throughput manner 
by using fluorescently labelled antigens and 
flow cytometry. 

“We can isolate several bacterial clones 
expressing full-length IgG antibodies that can 
bind to the antigen with the requisite affinity,” 
says Georgiou. “The advantage of the technology 
is that the expression of the antibodies is 
directly in bacteria and we can use the bacteria 
to produce the antibody without going through 
the steps of reformatting the antibody and then 
expressing in a mammalian system.” But 
the antibodies are not glycosylated, 
which limits some of the therapeu- 
tic applications. 

The more traditional way to 
screen and obtain IgG antibod- 
ies rapidly is the generation of 
recombinant antibody frag- 
ments. Single-chain variable 
(scFv) antibody fragments are 
created by the fusion of the var- 
iable regions of the heavy and 
light chains of immunoglobu- 
lins using a short peptide linker. 
This allows scFv fragments to 
be expressed from a single open 
reading frame and screened by 
phage display or other high- 
throughput approaches. 


Although larger than scFv fragments and still composed of two 
independent polypeptide chains, Fab antibody fragments are also 
being used and are usually favoured for their high stability and 
compatibility with existing antibody-based assays. The Fab antibody 
region is the antigen-binding region of the immunoglobulin. Fab 
fragments consist of one constant and one variable domain from each of 
the heavy and light chains. 

Flow cytometry is the assay of choice for screening 
recombinant-antibody libraries. 

Both scFv- and Fab-based technologies are developing rapidly, with 
several companies now supplying either scFv or Fab libraries and 
screening systems to consumers. One such company is BioInvent of 
Lund, Sweden, which provides the n-CoDeR human-antibody library 
based on both the scFv and Fab formats. Other companies, including 
Cambridge Antibody Technology in Cambridge, UK, and MorphoSys in 
Martinsried, Germany, have developed human-derived phage-display 
libraries using either the scFv or Fab format to identify binding 
regions for development of therapeutic monoclonal antibodies. 

Although rapidly generated and effective for many in vitro 
applications, scFv and Fab fragments are less effective for therapeutic 
applications because they have short half-lives. “Single-chain 
fragments can’t really be used in animals because they are cleared very 
rapidly: the half-life of a single-chain Fv is about 10 minutes, 
whereas full-length antibodies can persist for several days,” says 
Georgiou. 


Breaking with tradition 
Described for the first time in 1997 by Uhlén and his colleagues, 
affibodies were among the first non-immunoglobulin-based affinity 
reagents. These small molecules are based on a bacterial receptor 
(Staphylococcus aureus protein A), and use combinatorial protein 
engineering to introduce random mutations in the affinity region. 
Affibody of Bromma in Sweden, which was co-founded by Uhlén, 
currently produces affibody-based reagents for basic research 
laboratories and commercial partners. 

The structure of an affibody affinity reagent. 

Another non-immunoglobulin-based 
affinity reagent that is becoming more widely 
used is the aptamer. Made of DNA, RNA or 
modified nucleic acids and typically 15-40 
bases in length, aptamers have a stable tertiary 
structure that permits protein binding through 
van der Waals forces, hydrogen bonding 
and electrostatic interactions. Early studies 
showed that aptamers can be highly specific 
for target proteins, with the ability to distin- 
guish between related members of a protein 
family (S. D. Seiwart et al. Chem. Biol. 7, 833- 
843; 2000). 

Unlike the scFv and Fab fragments, both 
aptamers and affibodies are useful for in vivo 
applications because they have longer half- 
lives. In addition, both function well in the 
reducing environment of the cell cytosol, which 
is a problem for larger monoclonal antibodies. 
Currently, Affibody is testing an HER2-binding 
affibody as an alternative to Herceptin for 
treatment of HER2-positive breast cancer, with 
a clinical proof-of-principle microdosing study 
to occur this year. 

Archemix in Cambridge, Massachusetts, 
has three aptamer-based therapeutics in phase 
I clinical trials. Two aptamers target coagula- 
tion processes and the third targets nucleolin, 
a protein that is involved in the development 
of some cancers. 

Overall, there is much optimism regarding 
the future of alternative affinity reagents. But 
several problems have to be overcome before 
they are adopted more widely by the scientific 
community. One of the most pressing issues is 
the inability to produce these reagents at a truly 
high-throughput scale. Overcoming this obstacle 
would make these alternative binders not 
only cheaper but also significantly quicker to 
produce than traditional antibodies. 
“I would love to see a major technology 
breakthrough where someone shows that you 
can actually produce these in a high-through- 
put manner, but so far I don’t think anyone has 
been able to do that,” says Uhlén. 
Nathan Blow is the technology editor for 
Nature and Nature Methods. 


ACCURI CYTOMETERS 


clones. The other two groups were dedicated 
to immunohistochemistry and the informatics 
infrastructure necessary to deal with the large 
volume of data. 

To deal with image acquisition issues, the 
institute collaborated with Applied Imaging of 
San Jose, California, to develop an automated 
high-throughput image-analysis system suit- 
able for tissue microarray applications. So far, 
the Sanger project has generated more than 
4,000 monoclonal antibodies to 290 antigens, 
which are available to buy from Geneservice in 
Cambridge, UK. 

Although the project is being discontinued, 
McCafferty says that much has been learned 
about the bottlenecks of high-throughput 
generation of antibodies and how these can be 
overcome. “Surprisingly, the generation of the 
antibodies was not the major issue,” he says. 
“The bottlenecks were generating good qual- 
ity protein product to do selection, and how 
to deal with the large amounts of image data a 
project such as this produces.” 


Finding affinity 

Even as the Sanger project comes to a close, 
other initiatives are beginning to gather steam 
— although these have been hampered some- 
what by a lack of funding. “There seems to be 
a reluctance from the funding agencies to put 
money into large-scale antibody initiatives,” 
says Andrew Bradbury of the biosciences 
division at Los Alamos National Laboratory 
in New Mexico. The NCT’s five-year, $104-mil- 
lion clinical proteomic technologies initiative 


Cells stained with the 4G10 anti-phosphotyrosine 
antibody from Millipore. 


that is now getting off the ground may be the 
start of a change. 

In 2005, the NCI held a workshop to dis- 
cuss affinity capture. It found that the scien- 
tific community wanted renewable resources 
that were well characterized for performance 
data, says Clark. The meeting also revealed 
that the community was concerned by the 
lack of characterization data for most avail- 
able antibodies. 

Following this lead, the NCI proteomics rea- 
gent core, one of the centres in the clinical pro- 
teomic technology initiative, is embarking on 
the production of affinity reagents. To focus its 
efforts, it has identified a list of protein targets: 
all cancer-related proteins for which no com- 
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mercial antibodies are yet available. The core 
will develop monoclonal antibodies that will be 
characterized by Western blots, enzyme-linked = 
immunosorbent assay (ELISA), immunohisto- 
chemistry and immunoprecipitation followed 
by mass spectrometry. “All the raw data on how 
the antibodies perform on a variety of assays 
will be provided and an investigator will be able 
to acquire these antibodies through a website 
organized by the NCI,” says Clark.

In Europe, another group of investigators 
plans to generate affinity reagents against 
the human proteome. The group, called Pro- 
teomeBinders, consists of 26 European Union 
and two US institutional partners. “The goal 
or the hope is to get funding from the Euro-
pean Union to put a project together next year
or the year after,” says Bradbury, one of the US
participants. 

Although the antibody remains the affinity 
reagent of choice, the exploration of alterna- 
tive binders by large groups, such as Proteome- 
Binders, shows how far these non-traditional 
reagents have come in a relatively short time. 




Gold standard 
A quick glance through the catalogues from 
commercial vendors and researchers reveals 
thousands of antibodies not only to proteins, 
but also to specific protein changes such as 
post-translational modifications. Still other 
companies offer to produce antibodies to an 
investigator's antigen of interest. 

Monoclonal antibodies produced by animal 
immunization remain the ‘gold standard’ of 


ANTIBODIES IN THE FAST LANE 


In making recombinant antibodies,
the resulting antibody is only
as good as the combinatorial
library and the screening assay.
The trick is to find the molecule
of highest affinity and specificity
for the target among a library of
millions of clones. Traditionally,
recombinant antibody libraries
have been phage-based and the
screening relied on enzyme-linked
immunosorbent assay (ELISA),
not a high-throughput method.
Changes in both phage display and 
screening methods are now moving 
recombinant antibody production 
into a high-throughput world. 

Flow cytometry has become the 
assay of choice for rapid screening 
of clones from recombinant 
antibody libraries. “We looked at 
different ways of screening. Flow 
cytometry was the only one that 
seemed to meet our throughput 
requirements,” says Andrew 
Bradbury of Los Alamos National 
Laboratory in New Mexico. 


Bradbury's group has developed a 
flow-cytometry assay to screen its 
single-chain antibody-fragment 
phage libraries using a mixture 
of beads coated with specific and 
non-specific antigens. The method 
rapidly identifies antibodies that 
have good affinity for the protein 
of interest while discarding those 
that show low specificity. 
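
The selection logic behind such a mixed-bead screen can be sketched in a few lines of Python. This is only an illustration, not the Los Alamos group's actual analysis: the `CloneReading` record, the `select_clones` function and both thresholds are hypothetical, and fluorescence on the target-coated beads is assumed to stand in for affinity.

```python
from dataclasses import dataclass


@dataclass
class CloneReading:
    """Fluorescence readings for one antibody clone (hypothetical record)."""
    clone_id: str
    specific_signal: float     # signal on beads coated with the target antigen
    nonspecific_signal: float  # signal on beads coated with unrelated antigens


def select_clones(readings, min_signal=1000.0, min_ratio=10.0):
    """Keep clones that bind the target strongly and the controls weakly.

    Both thresholds are illustrative assumptions, not published values.
    Returns (clone_id, specificity_ratio) pairs, best ratio first.
    """
    selected = []
    for reading in readings:
        # Clamp the background to avoid dividing by zero.
        background = max(reading.nonspecific_signal, 1.0)
        ratio = reading.specific_signal / background
        if reading.specific_signal >= min_signal and ratio >= min_ratio:
            selected.append((reading.clone_id, ratio))
    return sorted(selected, key=lambda pair: pair[1], reverse=True)
```

A clone that lights up both bead types is discarded for low specificity, and one that barely lights up the target beads is discarded for weak binding; only clones passing both tests are kept.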
Phage-display screening 
methods are also used to identify 
antibodies that target post- 
translational modifications 
(PTMs) such as phosphorylation 
or acetylation. As PTMs have a
role in many processes — from 
gene regulation to apoptosis — 
they are of growing interest for the 
biological community. Companies 
have responded by developing 
antibodies targeting proteins in 
a specific state of modifications. 
“Antibodies to PTMs are gaining in 
importance with customers,” says 
Kumar Bala, director of antibody 
technologies for Millipore in
the company's lab in Temecula,
California.

But obtaining antibodies directed 
against PTMs is not a trivial task. 
Rockland Immunochemicals 
of Gilbertsville, Pennsylvania, 
has put in a lot of effort to
develop antibodies for looking
at phosphorylated and non-
phosphorylated forms of various
proteins in a sequence-independent
context, says Daniel O'Shannessy,
the company's vice-president of
corporate development.

Andrew Bradbury develops
recombinant single-chain antibodies.

Antibodies that recognize PTMs 
independently of the protein 
site on which the modification 
occurs are useful — particularly 
for enriching, for example, all 
phosphorylated proteins from 
a cell. But such antibodies 
are hard to make by animal 
immunization as the PTM itself is 
not immunogenic. So scientists are 
turning to recombinant molecules 
and in vitro screening such as 
phage display to isolate ‘pan-PTM’ 
affinity reagents. 

Although steps are being taken
in the right direction, many
more antibodies and further 
improvements in affinity reagent 
technology will be needed to 
understand and characterize the 
full range of PTMs found in nature. 
Still, Bala argues that the “best 
tools for purifying, identifying, 
differentiating and characterizing 
PTMs are antibodies”. N.B.

COMPANY PRODUCTS/ACTIVITY LOCATION URL
Antibody manufacturers, suppliers and services 
21st Century Biochemicals Custom monoclonal and polyclonal antibodies Marlboro, Massachusetts www.21stcenturybio.com 
Abcam Antibodies Cambridge, UK www.abcam.com 
Abnova Catalogue monoclonal antibodies Taipei City, Taiwan www.abnova.com.tw 
Affinity Labeling Technologies Nucleotide-based photoaffinity reagents for antibody labelling Lexington, Kentucky www.altcorp.com 
AnaSpec Catalogue and custom monoclonal and polyclonal antibodies San Jose, California www.anaspec.com
Antibodies Incorporated Custom and secondary monoclonal and polyclonal antibodies and reagents Davis, California www.antibodiesinc.com 
Atlas Antibodies Antibodies and affinity reagents Stockholm, Sweden www.atlasantibodies.com
Aves Labs Polyclonal chicken antibody services Tigard, Oregon www.aveslab.com
BD Biosciences Catalogue and custom antibodies San Diego, California www.bdbiosciences.com 
Beckman Coulter Catalogue monoclonal and polyclonal antibodies Fullerton, California www.beckmancoulter.com 
BioGenes Custom monoclonal and polyclonal antibody production Berlin, Germany www.biogenes.de
BioLegend Catalogue antibodies San Diego, California www.biolegend.com 
Biomeda ELISA kits, monoclonal and polyclonal antibodies Foster City, California biomeda.com 
Biotrend Catalogue antibodies, custom antibody services Cologne, Germany www.biotrend.com 
Boston Biochem Antibodies targeting ubiquitin-proteasome pathway Cambridge, Massachusetts www.bostonbiochem.com
Cambridge Research Biochemicals Custom peptide synthesis, polyclonal antibody production Billingham, UK www.crbdiscovery.com 
Cayman Chemical Catalogue antibodies Ann Arbor, Michigan www.caymanchem.com 
Cell Signaling Technology Antibodies for cell signalling through pathways such as MAP kinase, AKT and PKC Danvers, Massachusetts www.cellsignal.com 
Cytomyx Antibodies for kinases and ion channels Cambridge, UK www.cytomyx.com 
Delta Biolabs Catalogue antibodies to wide range of cell-signalling proteins Campbell, California www.deltabiolabs.com 
Diaclone Catalogue monoclonal antibodies; ELISA and Elispot Stamford, Connecticut www.diaclone.com 
Dragonfly Sciences Custom monoclonal and polyclonal antibody production Wellesley, Massachusetts www.dragonflysciences.com 
eBioscience Catalogue cytokine antibodies San Diego, California www.ebioscience.com 
ECM Biosciences Antibodies to post-translational modifications in proteins and peptides Versailles, Kentucky www.ecmbiosciences.com 
EMD Biosciences Catalogue antibodies San Diego, California www.emdbiosciences.com 
Fusion Antibodies Custom monoclonal or polyclonal antibodies, scFv production Belfast, UK www.fusionantibodies.com 
Geneservice Antibodies Cambridge, UK www.geneservice.co.uk 
Genovac Custom antibody services Freiburg, Germany www.genovac.com 
GenScript Antibody purification resins, custom peptide synthesis Piscataway, New Jersey www.genscript.com 
Harlan Custom antibody services, peptide synthesis Indianapolis, Indiana www.harlan.com
Invitrogen Antibodies, fluorescently tagged antibodies Carlsbad, California www.invitrogen.com 
Lake Placid Biologicals Antibodies for studying chromatin Lake Placid, New York www.lpbio.com 
Lonza Biologics Contract manufacturing of therapeutic antibodies Portsmouth, New Hampshire www.lonzabiologics.com
Mabtech Catalogue antibodies suitable for cytokine detection in ELISpot, ELISA and intracellular staining Nacka Strand, Sweden www.mabtech.com
Millipore Catalogue antibodies Billerica, Massachusetts www.millipore.com
Molecular Diagnostic Services Custom monoclonal or polyclonal antibody production San Diego, California www.mds-usa.com 
MorphoSys Phage-based antibody library screening services, therapeutic antibody discovery Martinsried, Germany www.morphosys.com
New England Biolabs Antibodies Ipswich, Massachusetts www.neb.com
Novus Biologicals Custom and catalogue antibodies Littleton, Colorado www.novus-biologicals.com 
Open Biosystems Custom antibody services Huntsville, Alabama www.openbiosystems.com 
OriGene Catalogue antibodies Rockville, Maryland www.origene.com
PeproTech Anticytokine antibodies Rocky Hill, New Jersey www.peprotech.com 
ProSci Catalogue antibodies, custom monoclonal and polyclonal antibodies Poway, California www.prosci-inc.com 
QED Bioscience Custom monoclonal and polyclonal antibody production, genetic immunization, cell-culture services San Diego, California www.qedbio.com
Rockland Immunochemicals Catalogue antibodies; primary, secondary and fluorescently tagged antibodies Gilbertsville, Pennsylvania www.rockland-inc.com
Sigma-Aldrich Catalogue monoclonal and polyclonal antibodies; chemicals and reagents St Louis, Missouri www.sigmaaldrich.com
Spring Valley Laboratories Custom monoclonal and polyclonal antibody production Woodbine, Maryland www.svlab.com
Zyomyx Human cytokine profiling system Hayward, California www.zyomyx.com 
Antibody-based drug discovery and development 
Abbott Monoclonal-antibody-based drug discovery and development Abbott Park, Illinois www.abbott.com 
AbD Serotec (MorphoSys) Phage-display libraries, antibody drug development Martinsried, Germany www.ab-direct.com
Ablynx Nanobodies development, drug discovery based on nanobodies Ghent, Belgium www.ablynx.com
Affibody Development of Affibodies as alternative binders, Affibody-based drug discovery Bromma, Sweden www.affibody.com 
Affimed Therapeutics Phage-display human antibody libraries Heidelberg, Germany www.affimed.com 
Amgen Monoclonal-antibody-based therapeutics, Enbrel Thousand Oaks, California www.amgen.com 
Archemix Aptamer-based drug discovery and development Cambridge, Massachusetts www.archemix.com 
BioInvent Developed n-CoDeR phage libraries, drug discovery and development Lund, Sweden www.bioinvent.com
Biotecnol Monoclonal-antibody-based therapeutics focused on cancer Baltimore, Maryland www.biotecnol.com 


Cambridge Antibody Technology Monoclonal-antibody therapeutics Cambridge, UK www.cambridgeantibody.com


Centocor Monoclonal-antibody-based therapeutics targeting immune-mediated inflammatory disorders Malvern, Pennsylvania www.centocor.com 

Dyax Phage-display Fab libraries, drug discovery Cambridge, Massachusetts www.dyax.com 

Ortho Biotech Antibody-based drug discovery, Orthoclone OKT3 monoclonal antibody Bridgewater, New Jersey www.orthobiotech.com
PDL BioPharma Antibody-based drug discovery Mountain View, California www.pdl.com

Regeneron Development of hybrid antibody molecules as therapeutics Tarrytown, New York www.regeneron.com 
XOMA Antibody-based therapeutics, Raptiva monoclonal-antibody therapy Berkeley, California www.xoma.com 
Immunoassay and immunohistochemistry

Accuri Cytometers Flow cytometry instrumentation Ann Arbor, Michigan www.accuricytometers.com 
Active Motif ELISA kits for transcription-factor assays, antibodies Carlsbad, California www.activemotif.com 
Alpha Diagnostic Custom peptides and polyclonal antibodies, kits for ELISA San Antonio, Texas www.4adi.com 


Applied Imaging Developed Ariol high-throughput immunohistochemistry system San Jose, California www.aicorp.com
Amersham Biosciences Reagents for immunochemistry assays Uppsala, Sweden www.amersham.com
Assay Designs Immunoassays (ELISA), antibodies Ann Arbor, Michigan www.assaydesigns.com
Beecher Instruments Tissue microarray instruments and reagents Sun Prairie, Wisconsin www.beecherinstruments.com
Bender MedSystems Flow cytometry systems, ELISA systems and reagents Vienna, Austria www.bendermedsystems.com
Bio-Rad Bio-Plex multiplex antibody assays Hercules, California www.bio-rad.com
Dako Flow cytometry kits and reagents, antibodies Carpinteria, California www.dakousa.com
Diagnostic Biosystems Monoclonal and polyclonal antibodies, kits and stains for immunohistochemistry Pleasanton, California www.dbiosys.com
GenBio Assays for antibodies characteristic of Lyme disease San Diego, California www.genbio.com
Panomics Luminex- and ELISA-based assays Redwood City, California www.panomics.com
Perkin Elmer Life Sciences Automated immunoassays Waltham, Massachusetts las.perkinelmer.com
Pierce Biotechnology Antibody production and purification, immunoassays Rockford, Illinois www.piercenet.com
R&D Systems Immunoassay kits and reagents, antibodies, ELISA, flow-cytometry kits Minneapolis, Minnesota www.rndsystems.com
Thermo Fisher Scientific Antibodies and immunoassay reagents Waltham, Massachusetts www.thermofisher.com
Vision Biosystems Automated immunohistochemistry systems and reagents Mount Waverley, Australia www.vision-bio.com


General
BioFX Laboratories Secondary antibodies, immunohistochemistry reagents and substrates Owings Mills, Maryland www.biofx.com
Cambio Molecular biology reagents, specialized biochemicals Cambridge, UK www.cambio.co.uk
Carl Zeiss Imaging systems Jena, Germany www.zeiss.com
Integra Biosciences Cell cultivation systems Chur, Switzerland www.integra-biosciences.com
MP Biomedicals Reagents and chemicals for research Aurora, Ohio www.mpbio.com
New Brunswick Scientific Cell cultivation instruments Edison, New Jersey www.nbsc.com
Nikon Instruments Imaging systems Melville, New York www.nikoninstruments.com
Peptide Specialty Laboratories Custom peptide synthesis Heidelberg, Germany www.peptid.de
USB Chemicals and reagents for molecular biology Cleveland, Ohio www.usbweb.com


naturejobs


By the time you've read this, I will have effectively traded Nature for nature
— my wife and I will be about 50 miles into our 1,000-mile hike on the
Appalachian trail from West Virginia to Maine (http://1000milesummer.blogspot.com).
So before I lace my boots and hoist my backpack, here are
some final thoughts I've gained from this editorial adventure.

Career paths aren't linear. To follow these paths, planning and formal training help, 
but it's perhaps more important to build skills that will help you match your interests 
with opportunities. The boundaries between industry, academia and government 
are blurring. To negotiate them, consider what skills you have and how they can be 
transferred from sector to sector. Don't want to choose one sector over another? 
You don't have to. You can work in each simultaneously. Aren't comfortable going 
from sector to sector or from on to off the bench? It’s OK. You can get as much or as 
little formal training as you like before you make your next move. Some people prefer 
to get additional degrees whereas others opt for more informal education. 

It's important to think globally. New hotbeds of scientific research — from 
Singapore to St Louis — are springing up worldwide, and researchers are 
collaborating in increasingly large groups. Scientists are also more mobile than they 
used to be, moving from one continent to another to follow the right opportunity, 
the right project or to pursue the best chance for funding and research freedom. 

In any case, I advocate working outside your comfort zone. It's frightening but
exhilarating to move into unfamiliar areas. Perhaps that’s why I've decided to
make the move, quite literally, on to a different path. Where will this walk lead? I
suspect that it will take me to a place where I will divide my time into thirds: teaching,
journalism and creative projects. Just as I don’t expect to find our hike easy and
comfortable, I don’t anticipate that my career change will be without challenge.
Anyway, before I slip into my sleeping bag and make an invocation for a bear-free
evening, I'd like to thank all Naturejobs readers for your support and feedback. I wish
everyone a successful career adventure. Happy trails!
Paul Smaglik, Naturejobs editor

The hard cell


Ethical quandaries aside, stem-cell science is attracting 
researchers worldwide. Ricki Lewis reports. 


Despite disagreements about the ethics of
embryonic stem-cell science, research in
the field is thriving globally. At least 500
companies and collaborations have sprung
up, 100 of them in the past year alone, according to
industry watchers. And although therapies are still 
largely at the preclinical and safety-testing stages, 
coaxing these cells to recapitulate development in vitro 
is already unveiling the beginnings of pathogenesis, 
revealing new drug targets. Basic and translational 
research positions are proliferating, drawing on a pool 
of applicants with diverse biomedical backgrounds. 

“Stem-cell research has brought us to the threshold 
of an entire new era in biomedicine,” says Sally Temple,
a professor in the centre for neuropharmacology and 
neuroscience at Albany Medical College in New York. 
The cells are the centrepiece of regenerative medicine, 
either alone or combined with semi-synthetic scaffolds 
in tissue engineering. And job opportunities are likely 
to expand if opposition to using cells from human 
embryos dissipates. It’s a distinct possibility, as ageing 
populations seek treatments for degenerative diseases, 
private funding is on the rise and public support grows. 

In short, stem-cell science is already exploding, with 
discoveries of more niches for the cells and innovative 
approaches to using them in illness and injury. “It’s rare 
that an advance in science could have such an impact 
on society,” says John Gearhart, director of the stem- 
cell programme at Johns Hopkins Medicine in 
Baltimore, Maryland. Runaway expectations are a risk, 
though. “It is having a big effect even before much has 
been demonstrated, and that is part of the problem.”

So far, a bigger problem has been the moral 
objections and political wavering. Australia last year 
eased restrictions, whereas Germany is going the other 
way. Research in the United States has had an uphill 
struggle — currently, only the few lines available before 
a 2001 ban on new ones can be used in federally funded
work — although some states such as California and
Wisconsin are strong backers. However, most moral
objections are to the use of embryos, and human
embryonic stem cells are only part of the story.

Stem-cell supporters: Susan Solomon (top) and
John McCulloch.

“They are a small niche of the stem-cell therapy 
world,” says Darin Weber, a senior consultant with
Biologics Consulting Group in Alexandria, Virginia. 
Non-human animals continue to be valuable sources of 
stem cells for research, he says. Also, the human body 
harbours many variations on the theme. Adult stem 
cells can also be used, as can those from the placenta 
and umbilical cord. Some companies use interesting 
semantic twists to cover themselves: EndGenitor 
Technologies of Indianapolis touts its work with “adult 
tissues that are free from any ethical controversy”. 


Finding your niche 

With so many companies springing up, niches are 
being defined and providing a variety of career choices. 
A common focus is bone-marrow-derived stem cells, 
because they both readily generate progeny of several 
distinct cell types and have been studied for half a 
century. Some companies offer intriguing spins on the 
bone-marrow theme. Cellerant Therapeutics of San 
Carlos, California, for example, is developing myeloid 
progenitor cells to restore bone marrow damaged in 
combat or from radiation. 

Other companies are coaxing stem cells from 
adipose tissue to yield cartilage, neurons, bone or 
muscle tissue; fashioning stem cells from teeth into 
personalized dental implants; and even growing 
replacement heart tissue, skin and bladders. Vet-Stem 
of Poway, California, treats horses with tendons and 
ligaments nurtured from adipose-tissue stem cells, and 
Vet Biotechnology of North Adelaide, Australia, is 
freezing cord blood cells from valued foals. 

Yet another tier of companies provides consumables 
such as reagents, culture media and instrumentation 
such as cell sorters and cryogenic apparatus. 

A solid background for a researcher includes a 
doctorate in molecular, cell or developmental biology, 
as well as skills to work with specific cell or tissue types 
acquired in at least two years of postdoc or industrial 
research. Degrees in pharmacology, immunology and
neuroscience are useful. Some companies hire at the
bachelor’s and master’s level: the North East England 
Stem Cell Institute in Newcastle is one of several 
facilities to offer a master’s degree in stem-cell research. 
Positions may call for broad skills. For example, a 
post at PrimeCell Therapeutics in Irvine, California, 
needs expertise in gene-expression analysis and 
monitoring, establishing and expanding cell cultures 
and working with tissue scaffolding. Some positions 
seek experience with specific cell lineages. At Centocor, 
part of the Johnson & Johnson Stem Cell Internal 
Venture in Radnor, Pennsylvania, a retinal cell biologist 
will isolate and culture progenitor cells from the retina, 
establish degenerative-disease cell lines and develop 
assays for disease markers. Odontis, in London, seeks 
expertise in craniofacial development, and several 
companies seek skills in isolating pancreatic islet cells. 
As therapies edge closer to the clinic, other skills will 
be needed. Egg retrieval for nuclear-transfer experiments 
requires lab and clinical experience, according to Alison 
Murdoch, head of reproductive medicine at Newcastle 
University, UK. Investigators need the clinical skills to 
counsel patients, give medication and collect eggs. But 
they also need to understand psychological and social 
aspects of fertility treatment. “Experience in a clinical 
embryology laboratory is an advantage,” she says.


Companies and collaborations 

Stem-cell science is gestating quietly in small biotech 
companies and academic-clinical alliances. “The drug 
industry is watching the beginning of clinical trials and 
starting to posture themselves,” says Bryon Petersen, 
associate professor in the pathology, stem-cell biology 
and regenerative-medicine programme at the 
University of Florida in Gainesville. “They’re still on 
the sidelines, but there is no turning back.” 

The drug industry is waiting for research elsewhere 
to pass the phase I hurdle, where projects stop if they 
don’t show efficacy or they entered clinical trials too 
soon, says Weber. But, he adds, drug companies will 
need expertise in molecular and cell-culture techniques 
and the ability to translate basic research into a product. 

The downside of stem-cell science at a biotech firm is 
the directed focus. Most small companies are venture- 
capital funded, so there is pressure not to do experiments 
that may show only an incremental change. “A small 
biotech company is looking for bigger leaps in progress,” 
says Weber. Academic scientists, says Petersen, don't 
have to worry about a backer who wants a $100-million
return on a $40-million donation in four years. “I have
the luxury of following the science — and you can
never predict where that will take you,” he says.

Cell mates: Dennis Steindler
(left), Bryon Petersen
(front) and their colleagues.

Web links

California Institute for Regenerative Medicine
www.cirm.ca.gov

Stem Cell Network, Canada
www.stemcellnetwork.ca

EuroStemCell
www.eurostemcell.org

International Society for Stem Cell Research
www.isscr.org

London Regenerative Medicine Network
www.regenmednetwork.com

National Institutes of Health, Stem Cell Information
stemcells.nih.gov/research/registry

New York Stem Cell Foundation
www.nyscf.org

Stem Cell Research Foundation
www.stemcellresearchfoundation.org

Collaborations are catalysing networking and 
resource-sharing. The London Regenerative Medicine 
Network, for example, has 2,500-plus members and 
holds monthly meetings with speakers; sponsors 
include GlaxoSmithKline and Thomson Pharma. In 
Canada, a translational development company, 
Aggregate Therapeutics in Ontario, represents 37 
leading principal investigators, pooling discoveries 
rather than working through individual technology 
transfer offices, says John McCulloch, adviser to the 
MaRS Venture Group in Toronto. The EuroStemCell 
Consortium provides cell lines and facilitates skills and 
knowledge-sharing. An investigator with an idea for an 
experiment but working in a country with restrictive 
policies, such as Germany or Italy, may be able to visit a 
lab with more freedom, such as in Norway or Britain. 


World of opportunity 

Meanwhile, Singapore continues to recruit top
scientists as part of government and private efforts
to boost biomedical research. Big names such as
Neal Copeland and Nancy Jenkins — formerly of the
US National Institutes of Health (NIH) — as well as
early-career scientists are among those enjoying its 
ample funding and minimal research restrictions. The 
Australian Stem Cell Centre in Clayton, Victoria, a 
collaboration of academia and biotechs, is recruiting 
under new executive director Joseph Sambrook, a 
veteran of Cold Spring Harbor Laboratory in New York. 

In the United States, private and state support tops up 
federal funding, which totalled roughly $40 million in 
2006. It isn’t the NIH budget that is so restrictive, 
researchers say, but the need for meticulous separation of 
work using non-approved cells from any other 
experiments, down to each pipette: investigators must 
delineate ‘NIH-free’ zones. This policy led to the birth of 
the private New York Stem Cell Foundation (NYSCF) in 
2005, which funds human embryonic stem-cell research 
and provides limited lab space in the US northeast. “We 
started the foundation because we felt that it would be 
some time before the various funding, policy and 
controversies are solved, and in the meanwhile, 
scientists’ work was ready to be done,” says co-founder
Susan Solomon. In 2006 the foundation opened the first
‘safe haven’ lab at an undisclosed location. It is
unofficially called the ‘underground railroad’, after the
Civil War slaves’ escape route. The lab has developed 
patient-derived embryonic stem-cell lines for diabetes 
and Parkinson's disease. In addition to grants and annual 
meetings, the NYSCF sponsors three-year postdocs. 

The foundations of the field are still being laid, many 
say. “We will be first learning how to manipulate stem 
cells and get a better understanding of the limits and 
possibilities, both in the United States and globally,” says 
Gil Sambrano, scientific review officer for the California 
Institute for Regenerative Medicine. But the pace is 
accelerating. “Progress in our understanding of the 
basic biology of stem cells has been tremendous in the 
past decade,” says Dennis Steindler, executive director of 
the McKnight Brain Institute at the University of 
Florida. “We are certain to see these findings translate to
treatments in the not-too-distant future.”
Ricki Lewis is a freelance science writer in 
Schenectady, New York.

The inside track from academia and industry


Great expectations 


You may have got the job, but making sure it’s the right fit is important for both employer and employee. 


Joann Boughman

Landing a good job is a
challenge, and the process can be 
daunting. Choosing a position 
and negotiating the details may 
not be easy, especially in an 
academic setting, but the first 
steps will set the tone for your 
future work relationships. 

If you have made it through 
the initial selection and interview 
stages, then you have already 
made yourself known, and your 
skills, training and experience 
have been recognized by those 
involved in the selection process. 
It is incumbent on you to find the 
best possible arrangement. 

You should research the 
institution and specific unit 
in which you will be working. 
Know exactly what they are 
looking for, going beyond the 
advertisement. Determine how 
your skills and experience fit 
their needs, but do not err by 
overselling your abilities. It is 
always best to be honest, but 
if there are new skills that you 
must master, show an eagerness 
to learn. 

Understanding the 
relationships and the ‘cast of 
characters’ at the job site is 
very important. Learn enough 
about the administrative 
structure to know which 
people have the authority to 
make decisions, including 
financial decisions. Although 
the principal investigator or 
laboratory director may be 
hiring, be aware that the division 
chief, programme director or 
department chair may have 
the ultimate authority for 
salary, space, or equipment 
and supply needs. 

Do enough background 
research to understand how 
the unit you may be joining fits 
into the larger context of the 
department, school, university 
or company. For example, 
is the unit a part of an entity 
that depends on extramural 
grant funds, or is there clear 
institutional support as well?
If the position of interest would
be funded only through grants,
you should be aware of the
potential consequences and 
timeframe in the event that 
funding is not renewed. 


Expectations 

Theirs: It is important that you
know what is expected of you 
in the new position. Will you 
be performing only research? 
Is teaching involved? If so, 
what is the time commitment 
and what types of student will 
you be responsible for? Are 
you expected to supply any
of your salary in the form of
grants and, if yes, how soon? 
Are there duties outside the lab 
setting or designated teaching 
load? What are the routine and 
style of the unit (work hours 
and environment, sharing of 
equipment or space, teamwork 
on any or all projects)? 

These questions are important 
not only for the day-to-day work 
routine, but also with respect 
to the amount of support and 
interaction that you can count
on in times of stress or need. The
style of the research workplace 
has an impact, and not only on 
how the lab or team operates.
It also provides insight into
communication processes, 
distribution of responsibilities, 
credit for grants, presentations, 
manuscripts and other 
professional opportunities. 


Yours: You should know the
components of the salary and 
benefits package that is being 
offered. Questions are essential, 
and if the prospective employer 
is not responsive, you should 
ask yourself why. Determine 
whether their expectations 
mesh with yours. What degree
of independence is available,
and how much independence
is expected? Will you have
sufficient mentoring as well
as supervision? What is the
evaluation process, and what 
type of feedback on performance 
will you receive? What 
personnel will you have access 
to (laboratory, central or core 
facilities, and administrative 
support)? Will there be 
opportunities to pursue 
activities of importance in your 
long-term goals (teaching, 
mentoring or participation
in university activities)?


The process 

Just as in any professional 
setting, competing for a position 
in science can be an arduous 
and sometimes uncomfortable 
process. You must present 
yourself professionally and 
competently. Be assertive 
although not aggressive, with a 
pleasant but persistent tone. 

Prioritize your own needs 
and expectations. Know your 
limits and what you are willing 
to compromise on. During the 
process, figure out what fits and 
what does not meet your needs. 
Remember that each position 
is a step in a life-long career
process. Although you may not 
know your future goals in detail, 
you should constantly be re- 
evaluating both your short- and 
long-term goals. 

During any interview and 
negotiation process, you must 
be willing to ask for what you 
want. At the same time, you 
must be ready to settle for what 
you need...but no less. There 
are multiple paths to any career 
goal. Use your creativity and 
observational skills to their best 
advantage, not only in your 
scientific research, but also in 
the processes that allow you the 
opportunity to perform your 
life’s work.
Joann Boughman is the executive 
vice-president of the American 
Society of Human Genetics in 
Bethesda, Maryland. 


